Showing 1–50 of 76 results for author: Fu, J

  1. arXiv:2406.16848  [pdf, other]

    eess.IV cs.CV

    Unsupervised Domain Adaptation for Pediatric Brain Tumor Segmentation

    Authors: Jingru Fu, Simone Bendazzoli, Örjan Smedby, Rodrigo Moreno

    Abstract: Significant advances have been made toward building accurate automatic segmentation models for adult gliomas. However, the performance of these models often degrades when applied to pediatric glioma due to their imaging and clinical differences (domain shift). Obtaining sufficient annotated data for pediatric glioma is typically difficult because of its rare nature. Also, manual annotations are sc…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures, conference

  2. arXiv:2406.14069  [pdf, other]

    eess.IV cs.CV

    Towards Multi-modality Fusion and Prototype-based Feature Refinement for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound

    Authors: Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zou, Jianhua Zhou, Yi Wang

    Abstract: Prostate cancer is a highly prevalent cancer and ranks as the second leading cause of cancer-related deaths in men globally. Recently, the utilization of multi-modality transrectal ultrasound (TRUS) has gained significant traction as a valuable technique for guiding prostate biopsies. In this study, we propose a novel learning framework for clinically significant prostate cancer (csPCa) classifica…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2405.17270  [pdf, other]

    eess.SP

    Towards Accurate Ego-lane Identification with Early Time Series Classification

    Authors: Yuchuan Jin, Theodor Stenhammar, David Bejmer, Axel Beauvisage, Yuxuan Xia, Junsheng Fu

    Abstract: Accurate and timely determination of a vehicle's current lane within a map is a critical task in autonomous driving systems. This paper utilizes an Early Time Series Classification (ETSC) method to achieve precise and rapid ego-lane identification in real-world driving data. The method begins by assessing the similarities between map and lane markings perceived by the vehicle's camera using measur…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures

  4. arXiv:2405.04290  [pdf, other]

    cs.RO eess.SP

    Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map

    Authors: Yuxuan Xia, Erik Stenborg, Junsheng Fu, Gustaf Hendeby

    Abstract: High-definition map with accurate lane-level information is crucial for autonomous driving, but the creation of these maps is a resource-intensive process. To this end, we present a cost-effective solution to create lane-level roadmaps using only the global navigation satellite system (GNSS) and a camera on customer vehicles. Our proposed solution utilizes a prior standard-definition (SD) map, GNS…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 27th International Conference on Information Fusion

  5. arXiv:2405.00157  [pdf, other]

    eess.SY

    Information-Theoretic Opacity-Enforcement in Markov Decision Processes

    Authors: Chongyang Shi, Yuheng Bu, Jie Fu

    Abstract: The paper studies information-theoretic opacity, an information-flow privacy property, in a setting involving two agents: A planning agent who controls a stochastic system and an observer who partially observes the system states. The goal of the observer is to infer some secret, represented by a random variable, from its partial observations, while the goal of the planning agent is to make the sec…

    Submitted 30 April, 2024; originally announced May 2024.

  6. arXiv:2404.18081  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG cs.MM eess.AS

    ComposerX: Multi-Agent Symbolic Music Composition with LLMs

    Authors: Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo

    Abstract: Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM subjects, current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and C…

    Submitted 30 April, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  7. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  8. arXiv:2404.01170  [pdf, other]

    cs.RO eess.IV

    Force-EvT: A Closer Look at Robotic Gripper Force Measurement with Event-based Vision Transformer

    Authors: Qianyu Guo, Ziqing Yu, Jiaming Fu, Yawen Lu, Yahya Zweiri, Dongming Gan

    Abstract: Robotic grippers are receiving increasing attention in various industries as essential components of robots for interacting and manipulating objects. While significant progress has been made in the past, conventional rigid grippers still have limitations in handling irregular objects and can damage fragile objects. We have shown that soft grippers offer deformability to adapt to a variety of objec…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 6 pages, 5 figures

  9. arXiv:2403.14778  [pdf, other]

    cs.CV eess.IV

    Diffusion Attack: Leveraging Stable Diffusion for Naturalistic Image Attacking

    Authors: Qianyu Guo, Jiaming Fu, Yawen Lu, Dongming Gan

    Abstract: In Virtual Reality (VR), adversarial attack remains a significant security threat. Most deep learning-based methods for physical and digital adversarial attacks focus on enhancing attack performance by crafting adversarial examples that contain large printable distortions that are easy for human observers to identify. However, attackers rarely impose limitations on the naturalness and comfort of t…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted to IEEE VRW

  10. arXiv:2403.04945  [pdf, other]

    cs.CL cs.LG eess.SP

    MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation

    Authors: Zhongwei Wan, Che Liu, Xin Wang, Chaofan Tao, Hui Shen, Zhenwu Peng, Jie Fu, Rossella Arcucci, Huaxiu Yao, Mi Zhang

    Abstract: Electrocardiogram (ECG) is the primary non-invasive diagnostic tool for monitoring cardiac conditions and is crucial in assisting clinicians. Recent studies have concentrated on classifying cardiac conditions using ECG data but have overlooked ECG report generation, which is time-consuming and requires clinical expertise. To automate ECG report generation and ensure its versatility, we propose the…

    Submitted 18 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Under review

  11. arXiv:2402.16153  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG cs.MM eess.AS

    ChatMusician: Understanding and Generating Music Intrinsically with LLM

    Authors: Ruibin Yuan, Hanfeng Lin, Yi Wang, Zeyue Tian, Shangda Wu, Tianhao Shen, Ge Zhang, Yuhang Wu, Cong Liu, Ziya Zhou, Ziyang Ma, Liumeng Xue, Ziyu Wang, Qin Liu, Tianyu Zheng, Yizhi Li, Yinghao Ma, Yiming Liang, Xiaowei Chi, Ruibo Liu, Zili Wang, Pengfei Li, Jingcheng Wu, Chenghua Lin, Qifeng Liu, et al. (10 additional authors not shown)

    Abstract: While Large Language Models (LLMs) demonstrate impressive capabilities in text generation, we find that their ability has yet to be generalized to music, humanity's creative language. We introduce ChatMusician, an open-source LLM that integrates intrinsic musical abilities. It is based on continual pre-training and finetuning LLaMA2 on a text-compatible music representation, ABC notation, and the…

    Submitted 25 February, 2024; originally announced February 2024.

    Comments: GitHub: https://shanghaicannon.github.io/ChatMusician/

  12. arXiv:2402.12103  [pdf, ps, other]

    eess.SP

    Interference Mitigation in LEO Constellations with Limited Radio Environment Information

    Authors: Fernando Moya Caceres, Akram Al-Hourani, Saman Atapattu, Michael Aygur, Sithamparanathan Kandeepan, Jing Fu, Ke Wang, Wayne S. T. Rowe, Mark Bowyer, Zarko Krusevac, Edward Arbon

    Abstract: This research paper delves into interference mitigation within Low Earth Orbit (LEO) satellite constellations, particularly when operating under constraints of limited radio environment information. Leveraging cognitive capabilities facilitated by the Radio Environment Map (REM), we explore strategies to mitigate the impact of both intentional and unintentional interference using planar antenna ar…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 6 pages, 12 figures, IEEE ICC 2024

  13. arXiv:2402.12015  [pdf, other]

    eess.SY cs.LG

    An Index Policy Based on Sarsa and Q-learning for Heterogeneous Smart Target Tracking

    Authors: Yuhang Hao, Zengfu Wang, Jing Fu, Quan Pan

    Abstract: In solving the non-myopic radar scheduling for multiple smart target tracking within an active and passive radar network, we need to consider both short-term enhanced tracking performance and a higher probability of target maneuvering in the future with active tracking. Acquiring the long-term tracking performance while scheduling the beam resources of active and passive radars poses a challenge.…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 11 pages

  14. arXiv:2402.09463  [pdf]

    eess.IV

    Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results

    Authors: Kelly Payette, Céline Steger, Roxane Licandro, Priscille de Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolò McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, Jin Ye, Mireia Alenyà, Valentin Comte, Oscar Camara, et al. (42 additional authors not shown)

    Abstract: Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard of fetal brain segmentation. However, FeTA 2021 was a single center study, and the generalizability of algorithms across dif…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Results from FeTA Challenge 2022, held at MICCAI; Manuscript submitted. Supplementary Info (including submission methods descriptions) available here: https://zenodo.org/records/10628648

  15. arXiv:2402.08987  [pdf, other]

    eess.IV cs.CV

    Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer

    Authors: Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zhou, Jianhua Zhou, Yi Wang

    Abstract: Prostate cancer is the most common noncutaneous cancer in the world. Recently, multi-modality transrectal ultrasound (TRUS) has increasingly become an effective tool for the guidance of prostate biopsies. With the aim of effectively identifying prostate cancer, we propose a framework for the classification of clinically significant prostate cancer (csPCa) from multi-modality TRUS videos. The frame…

    Submitted 17 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  16. arXiv:2312.15701  [pdf, other]

    eess.IV cs.CV cs.LG

    Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration

    Authors: Jiahong Fu, Qi Xie, Deyu Meng, Zongben Xu

    Abstract: The deep unfolding approach has attracted significant attention in computer vision tasks, which well connects conventional image processing modeling manners with more recent deep learning techniques. Specifically, by establishing a direct correspondence between algorithm operators at each implementation step and network modules within each layer, one can rationally construct an almost ``white box'…

    Submitted 25 December, 2023; originally announced December 2023.

  17. arXiv:2312.09585  [pdf, other]

    eess.SY cs.AI cs.LG

    Joint State Estimation and Noise Identification Based on Variational Optimization

    Authors: Hua Lan, Shijie Zhao, Jinjie Hu, Zengfu Wang, Jing Fu

    Abstract: In this article, the state estimation problems with unknown process noise and measurement noise covariances for both linear and nonlinear systems are considered. By formulating the joint estimation of system state and noise parameters into an optimization problem, a novel adaptive Kalman filter method based on conjugate-computation variational inference, referred to as CVIAKF, is proposed to appro…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 13 pages

  18. arXiv:2312.09576  [pdf, other]

    eess.IV cs.CV

    SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

    Authors: Xiangde Luo, Jia Fu, Yunxin Zhong, Shuolin Liu, Bing Han, Mehdi Astaraki, Simone Bendazzoli, Iuliana Toma-Dasu, Yiwen Ye, Ziyang Chen, Yong Xia, Yanzhou Su, Jin Ye, Junjun He, Zhaohu Xing, Hongqiu Wang, Lei Zhu, Kaixiang Yang, Xin Fang, Zhiwei Wang, Chan Woong Lee, Sang Joon Park, Jaehee Chun, Constantin Ulrich, Klaus H. Maier-Hein, et al. (17 additional authors not shown)

    Abstract: Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: A challenge report of SegRap2023 (organized in conjunction with MICCAI2023)

  19. arXiv:2312.07858  [pdf, other]

    eess.SY

    Non-myopic Beam Scheduling for Multiple Smart Target Tracking in Phased Array Radar Network

    Authors: Yuhang Hao, Zengfu Wang, José Niño-Mora, Jing Fu, Min Yang, Quan Pan

    Abstract: A smart target, also referred to as a reactive target, can take maneuvering motions to hinder radar tracking. We address beam scheduling for tracking multiple smart targets in phased array radar networks. We aim to mitigate the performance degradation in previous myopic tracking methods and enhance the system performance, which is measured by a discounted cost objective related to the tracking err…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 14 pages

  20. arXiv:2312.01529  [pdf, other]

    cs.CV cs.CL cs.LG eess.IV

    T3D: Towards 3D Medical Image Understanding through Vision-Language Pre-training

    Authors: Che Liu, Cheng Ouyang, Yinda Chen, Cesar César Quilodrán-Casas, Lei Ma, Jie Fu, Yike Guo, Anand Shah, Wenjia Bai, Rossella Arcucci

    Abstract: Expert annotation of 3D medical image for downstream analysis is resource-intensive, posing challenges in clinical applications. Visual self-supervised learning (vSSL), though effective for learning visual invariance, neglects the incorporation of domain knowledge from medicine. To incorporate medical knowledge into visual representation learning, vision-language pre-training (VLP) has shown promi…

    Submitted 5 December, 2023; v1 submitted 3 December, 2023; originally announced December 2023.

  21. arXiv:2310.05938  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS

    Component attention network for multimodal dance improvisation recognition

    Authors: Jia Fu, Jiarui Tan, Wenjie Yin, Sepideh Pashami, Mårten Björkman

    Abstract: Dance improvisation is an active research topic in the arts. Motion analysis of improvised dance can be challenging due to its unique dynamics. Data-driven dance motion analysis, including recognition and generation, is often limited to skeletal data. However, data of other modalities, such as audio, can be recorded and benefit downstream tasks. This paper explores the application and performance…

    Submitted 24 August, 2023; originally announced October 2023.

    Comments: Accepted to 25th ACM International Conference on Multimodal Interaction (ICMI 2023)

    ACM Class: I.2; I.5.4

  22. arXiv:2309.03519  [pdf, other]

    math.OC cs.MA eess.SY

    A cutting-surface consensus approach for distributed robust optimization of multi-agent systems

    Authors: Jun Fu, Xunhao Wu

    Abstract: A novel and fully distributed optimization method is proposed for the distributed robust convex program (DRCP) over a time-varying unbalanced directed network under the uniformly jointly strongly connected (UJSC) assumption. Firstly, a tractable approximated DRCP (ADRCP) is introduced by discretizing the semi-infinite constraints into a finite number of inequality constraints and restricting the r…

    Submitted 15 June, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Submitted to IEEE Transactions on Automatic Control

  23. arXiv:2309.01201  [pdf, other]

    math.OC cs.MA eess.SY

    Distributed robust optimization for multi-agent systems with guaranteed finite-time convergence

    Authors: Xunhao Wu, Jun Fu

    Abstract: A novel distributed algorithm is proposed for finite-time converging to a feasible consensus solution satisfying global optimality to a certain accuracy of the distributed robust convex optimization problem (DRCO) subject to bounded uncertainty under a uniformly strongly connected network. Firstly, a distributed lower bounding procedure is developed, which is based on an outer iterative approximat…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: Submitted for publication in Automatica

  24. arXiv:2307.05161  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    On the Effectiveness of Speech Self-supervised Learning for Music

    Authors: Yinghao Ma, Ruibin Yuan, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Ruibo Liu, Gus Xia, Roger Dannenberg, Yike Guo, Jie Fu

    Abstract: Self-supervised learning (SSL) has shown promising results in various speech and natural language processing applications. However, its efficacy in music information retrieval (MIR) still remains largely unexplored. While previous SSL models pre-trained on music recordings may have been mostly closed-sourced, recent speech models such as wav2vec2.0 have shown promise in music modelling. Neverthele…

    Submitted 11 July, 2023; originally announced July 2023.

  25. arXiv:2306.17103  [pdf, other]

    cs.CL cs.SD eess.AS

    LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

    Authors: Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi LI, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo

    Abstract: We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal. Our novel, training-free approach utilizes Whisper, a weakly supervised robust speech recognition model, and GPT-4, today's most performant chat-based large language mo…

    Submitted 21 November, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 9 pages, 2 figures, 5 tables, accepted by ISMIR 2023

  26. arXiv:2306.10805  [pdf]

    physics.med-ph cs.CV eess.IV

    Experts' cognition-driven ensemble deep learning for external validation of predicting pathological complete response to neoadjuvant chemotherapy from histological images in breast cancer

    Authors: Yongquan Yang, Fengling Li, Yani Wei, Yuanyuan Zhao, Jing Fu, Xiuli Xiao, Hong Bu

    Abstract: In breast cancer imaging, there has been a trend to directly predict pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) from histological images based on deep learning (DL). However, it has been a commonly known problem that the constructed DL-based models numerically have better performances in internal validation than in external validation. The primary reason for this situat…

    Submitted 19 June, 2023; originally announced June 2023.

  27. arXiv:2306.10548  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    MARBLE: Music Audio Representation Benchmark for Universal Evaluation

    Authors: Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu

    Abstract: In the era of extensive intersection between art and Artificial Intelligence (AI), such as image generation and fiction co-creation, AI for music remains relatively nascent, particularly in music understanding. This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark. To address this issue…

    Submitted 23 November, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: camera-ready version for NeurIPS 2023

  28. arXiv:2306.09710  [pdf, other]

    eess.SY

    Combinatorial-restless-bandit-based Transmitter-Receiver Online Selection for Distributed MIMO Radars With Non-Stationary Channels

    Authors: Yuhang Hao, Zengfu Wang, Jing Fu, Xianglong Bai, Can Li, Quan Pan

    Abstract: We track moving targets with a distributed multiple-input multiple-output (MIMO) radar, for which the transmitters and receivers are appropriately paired and selected with a limited number of radar stations. We aim to maximize the sum of the signal-to-interference-plus-noise ratios (SINRs) of all the targets by sensibly selecting the transmitter-receiver pairs during the tracking period. A key is…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 13 pages

  29. arXiv:2306.02398  [pdf, other]

    cs.CV eess.IV

    Scale Guided Hypernetwork for Blind Super-Resolution Image Quality Assessment

    Authors: Jun Fu

    Abstract: With the emergence of image super-resolution (SR) algorithm, how to blindly evaluate the quality of super-resolution images has become an urgent task. However, existing blind SR image quality assessment (IQA) metrics merely focus on visual characteristics of super-resolution images, ignoring the available scale information. In this paper, we reveal that the scale factor has a statistically signifi…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: new framework for blind super-resolution image quality assessment

  30. arXiv:2306.00107  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

    Authors: Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Xingran Chen, Hanzhi Yin, Chenghao Xiao, Chenghua Lin, Anton Ragni, Emmanouil Benetos, Norbert Gyenge, Roger Dannenberg, Ruibo Liu, Wenhu Chen, Gus Xia, Yemin Shi, Wenhao Huang, Zili Wang, Yike Guo, Jie Fu

    Abstract: Self-supervised learning (SSL) has recently emerged as a promising paradigm for training generalisable models on large-scale data in the fields of vision, text, and speech. Although SSL has been proven effective in speech and audio, its application to music audio has yet to be thoroughly explored. This is partially due to the distinctive challenges associated with modelling musical knowledge, part…

    Submitted 22 April, 2024; v1 submitted 31 May, 2023; originally announced June 2023.

    Comments: accepted by ICLR 2024

  31. arXiv:2305.15985  [pdf, other]

    cs.IT eess.SP

    Resource Allocation in Cell-Free MU-MIMO Multicarrier System with Finite and Infinite Blocklength

    Authors: Jiafei Fu, Pengcheng Zhu, Bo Ai, Jiangzhou Wang, Xiaohu You

    Abstract: The explosive growth of data results in more scarce spectrum resources. It is important to optimize the system performance under limited resources. In this paper, we investigate how to achieve weighted throughput (WTP) maximization for cell-free (CF) multiuser MIMO (MU-MIMO) multicarrier (MC) systems through resource allocation (RA), in the cases of finite blocklength (FBL) and infinite blocklengt…

    Submitted 25 May, 2023; originally announced May 2023.

  32. arXiv:2305.15357  [pdf, other]

    eess.IV cs.CV cs.LG

    Solving Diffusion ODEs with Optimal Boundary Conditions for Better Image Super-Resolution

    Authors: Yiyang Ma, Huan Yang, Wenhan Yang, Jianlong Fu, Jiaying Liu

    Abstract: Diffusion models, as a kind of powerful generative model, have given impressive results on image super-resolution (SR) tasks. However, due to the randomness introduced in the reverse process of diffusion models, the performances of diffusion-based SR models are fluctuating at every time of sampling, especially for samplers with few resampled steps. This inherent randomness of diffusion models resu…

    Submitted 1 April, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted by ICLR 2024

  33. arXiv:2304.01286  [pdf, other]

    eess.SY

    Synthesis of Opacity-Enforcing Winning Strategies Against Colluded Opponent

    Authors: Chongyang Shi, Abhishek N. Kulkarni, Hazhar Rahmani, Jie Fu

    Abstract: This paper studies a language-based opacity enforcement in a two-player, zero-sum game on a graph. In this game, player 1 (P1) wins if it can achieve a secret temporal goal described by the language of a finite automaton, no matter what strategy the opponent player 2 (P2) selects. In addition, P1 aims to win while making its goal opaque to a passive observer with imperfect information. However, P2…

    Submitted 3 April, 2023; originally announced April 2023.

  34. Optimization of the energy efficiency in Smart Internet of Vehicles assisted by MEC

    Authors: Jiafei Fu, Pengcheng Zhu, Jingyu Hua, Jiamin Li, Jiangang Wen

    Abstract: Smart Internet of Vehicles (IoV) as a promising application in Internet of Things (IoT) emerges with the development of the fifth generation mobile communication (5G). Nevertheless, the heterogeneous requirements of sufficient battery capacity, powerful computing ability and energy efficiency for electric vehicles face great challenges due to the explosive data growth in 5G and the sixth generatio…

    Submitted 13 January, 2023; originally announced January 2023.

    Comments: 17 pages, 9 figures, EURASIP J. Adv. Signal Process

    Journal ref: EURASIP J. Adv. Signal Process. 2022 (2022) 13

  35. arXiv:2301.01349  [pdf, other]

    eess.SY

    Quantitative Planning with Action Deception in Concurrent Stochastic Games

    Authors: Chongyang Shi, Shuo Han, Jie Fu

    Abstract: We study a class of two-player competitive concurrent stochastic games on graphs with reachability objectives. Specifically, player 1 aims to reach a subset $F_1$ of game states, and player 2 aims to reach a subset $F_2$ of game states where $F_2\cap F_1=\emptyset$. Both players aim to satisfy their reachability objectives before their opponent does. Yet, the information players have about the gam…

    Submitted 22 March, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

  36. arXiv:2212.14046  [pdf, other]

    eess.IV cs.AI cs.CV

    Learning Spatiotemporal Frequency-Transformer for Low-Quality Video Super-Resolution

    Authors: Zhongwei Qiu, Huan Yang, Jianlong Fu, Daochang Liu, Chang Xu, Dongmei Fu

    Abstract: Video Super-Resolution (VSR) aims to restore high-resolution (HR) videos from low-resolution (LR) videos. Existing VSR techniques usually recover HR frames by extracting pertinent textures from nearby frames with known degradation processes. Despite significant progress, grand challenges are remained to effectively extract and transmit high-quality textures from high-degraded low-quality sequences…

    Submitted 27 December, 2022; originally announced December 2022.

  37. arXiv:2212.02508  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning

    Authors: Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Chenghua Lin, Xingran Chen, Anton Ragni, Hanzhi Yin, Zhijie Hu, Haoyu He, Emmanouil Benetos, Norbert Gyenge, Ruibo Liu, Jie Fu

    Abstract: The deep learning community has witnessed an exponentially growing interest in self-supervised learning (SSL). However, it still remains unexplored how to build a framework for learning useful representations of raw music waveforms in a self-supervised manner. In this work, we design Music2Vec, a framework exploring different SSL algorithmic components and tricks for music audio recordings. Our mo…

    Submitted 5 December, 2022; originally announced December 2022.

  38. arXiv:2210.01878  [pdf, other]

    cs.AI cs.GT cs.RO eess.SY

    Opportunistic Qualitative Planning in Stochastic Systems with Incomplete Preferences over Reachability Objectives

    Authors: Abhishek N. Kulkarni, Jie Fu

    Abstract: Preferences play a key role in determining what goals/constraints to satisfy when not all constraints can be satisfied simultaneously. In this paper, we study how to synthesize preference satisfying plans in stochastic systems, modeled as an MDP, given a (possibly incomplete) combinative preference model over temporally extended goals. We start by introducing new semantics to interpret preferences…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: 7 pages, 3 figures, under review for IEEE ACC 2023

  39. arXiv:2209.10305  [pdf, other]

    cs.CV eess.IV

    KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution

    Authors: Jiahong Fu, Hong Wang, Qi Xie, Qian Zhao, Deyu Meng, Zongben Xu

    Abstract: Although current deep learning-based methods have gained promising performance in the blind single image super-resolution (SISR) task, most of them mainly focus on heuristically constructing diverse network architectures and put less emphasis on the explicit embedding of the physical generation mechanism between blur kernels and high-resolution (HR) images. To alleviate this issue, we propose a mo…

    Submitted 22 September, 2022; v1 submitted 21 September, 2022; originally announced September 2022.

    Comments: Accepted by ECCV2022

  40. arXiv:2209.01749  [pdf, other]

    eess.IV cs.CV

    4D LUT: Learnable Context-Aware 4D Lookup Table for Image Enhancement

    Authors: Chengxu Liu, Huan Yang, Jianlong Fu, Xueming Qian

    Abstract: Image enhancement aims at improving the aesthetic visual quality of photos by retouching the color and tone, and is an essential technology for professional digital photography. Recent years deep learning-based image enhancement algorithms have achieved promising performance and attracted increasing popularity. However, typical efforts attempt to construct a uniform enhancer for all pixels' color…

    Submitted 5 September, 2022; originally announced September 2022.

  41. arXiv:2207.00943  [pdf, other]

    cs.CV eess.IV

    Degradation-Guided Meta-Restoration Network for Blind Super-Resolution

    Authors: Fuzhi Yang, Huan Yang, Yanhong Zeng, Jianlong Fu, Hongtao Lu

    Abstract: Blind super-resolution (SR) aims to recover high-quality visual textures from a low-resolution (LR) image, which is usually degraded by down-sampling blur kernels and additive noises. This task is extremely difficult due to the challenges of complicated image degradations in the real-world. Existing SR approaches either assume a predefined blur kernel or a fixed noise, which limits these approache…

    Submitted 2 July, 2022; originally announced July 2022.

  42. arXiv:2206.03336  [pdf, other]

    eess.IV cs.CV cs.LG

    Parotid Gland MRI Segmentation Based on Swin-Unet and Multimodal Images

    Authors: Zi'an Xu, Yin Dai, Fayu Liu, Siqi Li, Sheng Liu, Lifu Shi, Jun Fu

    Abstract: Background and objective: Parotid gland tumors account for approximately 2% to 10% of head and neck tumors. Preoperative tumor localization, differential diagnosis, and subsequent selection of appropriate treatment for parotid gland tumors are critical. However, the relative rarity of these tumors and the highly dispersed tissue types have left an unmet need for a subtle differential diagnosis of…

    Submitted 26 December, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

  43. arXiv:2205.15607  [pdf, other]

    eess.IV cs.CV

    Generative Aging of Brain Images with Diffeomorphic Registration

    Authors: Jingru Fu, Antonios Tzortzakakis, José Barroso, Eric Westman, Daniel Ferreira, Rodrigo Moreno

    Abstract: Analyzing and predicting brain aging is essential for early prognosis and accurate diagnosis of cognitive diseases. The technique of neuroimaging, such as Magnetic Resonance Imaging (MRI), provides a noninvasive means of observing the aging process within the brain. With longitudinal image data collection, data-intensive Artificial Intelligence (AI) algorithms have been used to examine brain aging…

    Submitted 31 May, 2022; originally announced May 2022.

  44. arXiv:2204.04216  [pdf, other]

    eess.IV cs.CV

    Learning Trajectory-Aware Transformer for Video Super-Resolution

    Authors: Chengxu Liu, Huan Yang, Jianlong Fu, Xueming Qian

    Abstract: Video super-resolution (VSR) aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts. Although some progress has been made, there are grand challenges to effectively utilize temporal dependency in entire video sequences. Existing approaches usually align and aggregate video frames from limited adjacent frames (e.g., 5 or 7 frames), which prevents these…

    Submitted 20 April, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: CVPR 2022 Oral

  45. arXiv:2204.01584  [pdf, other]

    math.OC cs.GT eess.SY

    Synthesizing Attack-Aware Control and Active Sensing Strategies under Reactive Sensor Attacks

    Authors: Sumukha Udupa, Abhishek N. Kulkarni, Shuo Han, Nandi O. Leslie, Charles A. Kamhoua, Jie Fu

    Abstract: We consider the probabilistic planning problem for a defender (P1) who can jointly query the sensors and take control actions to reach a set of goal states while being aware of possible sensor attacks by an adversary (P2) who has perfect observations. To synthesize a provably-correct, attack-aware joint control and active sensing strategy for P1, we construct a stochastic game on graph with augmen…

    Submitted 29 November, 2022; v1 submitted 28 March, 2022; originally announced April 2022.

    Comments: 7 pages, 3 figure, 1 table, 1 algorithm

    Journal ref: LCSS vol.7(2022)265-270

  46. arXiv:2203.13803  [pdf, other]

    cs.FL cs.GT eess.SY

    Opportunistic Qualitative Planning in Stochastic Systems with Preferences over Temporal Logic Objectives

    Authors: Abhishek Ninad Kulkarni, Jie Fu

    Abstract: Preferences play a key role in determining what goals/constraints to satisfy when not all constraints can be satisfied simultaneously. In this work, we study preference-based planning in a stochastic system modeled as a Markov decision process, subject to a possible incomplete preference over temporally extended goals. Our contributions are three folds: First, we introduce a preference language to…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: 6 pages, 3 figure, submitted to IEEE L-CSS

  47. arXiv:2201.09421  [pdf, other]

    cs.CV eess.IV

    Mutual Attention-based Hybrid Dimensional Network for Multimodal Imaging Computer-aided Diagnosis

    Authors: Yin Dai, Yifan Gao, Fayu Liu, Jun Fu

    Abstract: Recent works on Multimodal 3D Computer-aided diagnosis have demonstrated that obtaining a competitive automatic diagnosis model when a 3D convolution neural network (CNN) brings more parameters and medical images are scarce remains nontrivial and challenging. Considering both consistencies of regions of interest in multimodal images and diagnostic accuracy, we propose a novel mutual attention-base…

    Submitted 23 January, 2022; originally announced January 2022.

    Comments: 11 pages, 8 figures

  48. arXiv:2111.13078  [pdf, other]

    cs.CV eess.IV

    A Close Look at Few-shot Real Image Super-resolution from the Distortion Relation Perspective

    Authors: Xin Li, Xin Jin, Jun Fu, Xiaoyuan Yu, Bei Tong, Zhibo Chen

    Abstract: Collecting amounts of distorted/clean image pairs in the real world is non-trivial, which seriously limits the practical applications of these supervised learning-based methods on real-world image super-resolution (RealSR). Previous works usually address this problem by leveraging unsupervised learning-based technologies to alleviate the dependency on paired training samples. However, these method…

    Submitted 18 April, 2023; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: 12 pages, first paper for few-shot real image super-resolution

  49. arXiv:2110.03032  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY stat.ML

    Learning Multi-Objective Curricula for Robotic Policy Learning

    Authors: Jikun Kang, Miao Liu, Abhinav Gupta, Chris Pal, Xue Liu, Jie Fu

    Abstract: Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL). They are designed to control how a DRL agent collects data, which is inspired by how humans gradually adapt their learning processes to their capabilities. For example, ACL can be used for subgoal generation, reward shaping, environment…

    Submitted 19 October, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: CoRL 2022; Reinforcement Learning; Meta-Reinforcement Learning; Hyper-network

  50. arXiv:2108.07948  [pdf, other]

    eess.IV cs.CV

    Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment

    Authors: Heliang Zheng, Huan Yang, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo

    Abstract: An important scenario for image quality assessment (IQA) is to evaluate image restoration (IR) algorithms. The state-of-the-art approaches adopt a full-reference paradigm that compares restored images with their corresponding pristine-quality images. However, pristine-quality images are usually unavailable in blind image restoration tasks and real-world scenarios. In this paper, we propose a pract…

    Submitted 17 August, 2021; originally announced August 2021.

    Journal ref: ICCV 2021 Camera Ready Version