Search | arXiv e-print repository

Renal digital pathology visual knowledge search platform based on language large model and book knowledge

Authors: Xiaomin Lv, Chong Lai, Liya Ding, Maode Lai, Qingrong Sun

Abstract: Large models have become mainstream, yet their applications in digital pathology still require exploration. Meanwhile, renal pathology images play an important role in the diagnosis of renal diseases. We segmented images and paired them with their corresponding text descriptions from 60 renal pathology books, performed clustering analysis on all image and text-description features using large models, and ultimately built a retrieval system based on the semantic features of large models. Based on the above analysis, we established a knowledge base of 10,317 renal pathology images with paired text descriptions, and then evaluated the semantic feature capabilities of four large language models (GPT-2, Gemma, Llama, and Qwen) and the image feature capabilities of the DINOv2 large model. Furthermore, we built a semantic retrieval system that retrieves pathological images from text descriptions, and named it RppD (aidp.zjsru.edu.cn).

Submitted 26 May, 2024; originally announced June 2024.

Comments: 9 pages, 6 figures

arXiv:2405.09552 [pdf, other]

ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

Authors: Jiayi Wang, Yi-An Mao, Xiaoyu Ma, Sicen Guo, Yuting Shao, Xiao Lv, Wenting Han, Mark Christopher, Linda M. Zangwill, Yanlong Bi, Rui Fan

Abstract: Optic nerve head (ONH) detection has been a crucial area of study in ophthalmology for years. However, the significant discrepancy between fundus image datasets, each generated using a single type of fundus camera, poses challenges to the generalizability of ONH detection approaches developed based on semantic segmentation networks. Despite the numerous recent advancements in general-purpose semantic segmentation methods using convolutional neural networks (CNNs) and Transformers, there is currently a lack of benchmarks for these state-of-the-art (SoTA) networks specifically trained for ONH detection. Therefore, in this article, we make contributions from three key aspects: network design, the publication of a dataset, and the establishment of a comprehensive benchmark. Our newly developed ONH detection network, referred to as ODFormer, is based upon the Swin Transformer architecture and incorporates two novel components: a multi-scale context aggregator and a lightweight bidirectional feature recalibrator. Our published large-scale dataset, known as TongjiU-DROD, provides multi-resolution fundus images for each participant, captured using two distinct types of cameras. Our established benchmark involves three datasets: DRIONS-DB, DRISHTI-GS1, and TongjiU-DROD, created by researchers from different countries and containing fundus images captured from participants of diverse races and ages. Extensive experimental results demonstrate that our proposed ODFormer outperforms other state-of-the-art (SoTA) networks in terms of performance and generalizability. Our dataset and source code are publicly available at mias.group/ODFormer.

Submitted 2 June, 2024; v1 submitted 15 April, 2024; originally announced May 2024.

arXiv:2303.05745 [pdf, other]

Multi-site, Multi-domain Airway Tree Modeling (ATM'22): A Public Benchmark for Pulmonary Airway Segmentation

Authors: Minghui Zhang, Yangqian Wu, Hanxiao Zhang, Yulei Qin, Hao Zheng, Wen Tang, Corey Arnold, Chenhao Pei, Pengxin Yu, Yang Nan, Guang Yang, Simon Walsh, Dominic C. Marshall, Matthieu Komorowski, Puyang Wang, Dazhou Guo, Dakai Jin, Ya'nan Wu, Shuiqing Zhao, Runsheng Chang, Boyu Zhang, Xing Lv, Abdul Qayyum, Moona Mazher, Qi Su, et al. (11 additional authors not shown)

Abstract: Open international challenges are becoming the de facto standard for assessing computer vision and image analysis algorithms. In recent years, new methods have extended the reach of pulmonary airway segmentation that is closer to the limit of image resolution. Since EXACT'09 pulmonary airway segmentation, limited effort has been directed to quantitative comparison of newly emerged algorithms driven by the maturity of deep learning based approaches and clinical drive for resolving finer details of distal airways for early intervention of pulmonary diseases. Thus far, public annotated datasets are extremely limited, hindering the development of data-driven methods and detailed performance evaluation of new algorithms. To provide a benchmark for the medical imaging community, we organized the Multi-site, Multi-domain Airway Tree Modeling (ATM'22), which was held as an official challenge event during the MICCAI 2022 conference. ATM'22 provides large-scale CT scans with detailed pulmonary airway annotation, including 500 CT scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from different sites and it further included a portion of noisy COVID-19 CTs with ground-glass opacity and consolidation. Twenty-three teams participated in the entire phase of the challenge and the algorithms for the top ten teams are reviewed in this paper. Quantitative and qualitative results revealed that deep learning models embedded with the topological continuity enhancement achieved superior performance in general. ATM'22 challenge holds as an open-call design, the training data and the gold standard evaluation are available upon successful registration via its homepage.

Submitted 27 June, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

Comments: 32 pages, 16 figures. Homepage: https://atm22.grand-challenge.org/. Submitted

arXiv:2212.04908 [pdf]

doi 10.1631/FITEE.2200666

Reconfigurable Intelligent Surfaces for 6G -- Applications, Challenges and Solutions

Authors: Yajun Zhao, Xin Lv

Abstract: It is expected that scholars will continuously strengthen the depth and breadth of theoretical research on RIS, and provide a higher theoretical upper bound for the engineering application of RIS. While making breakthroughs in academic research, it has also made rapid progress in engineering application research and industrialization promotion. This paper will provide an overview of RIS engineering applications, and make a systematic and in-depth analysis of the challenges and candidate solutions of RIS engineering applications. Future trends and challenges are also provided.

Submitted 3 December, 2022; originally announced December 2022.

Comments: 22 pages. Frontiers of Information Technology & Electronic Engineering, 2023

arXiv:2209.01740 [pdf]

A Multi-scale Video Denoising Algorithm for Raw Image

Authors: Bin Ma, Yueli Hu, Xianxian Lv, Kai Li

Abstract: Video denoising for raw images has always been a difficulty in camera image processing. On the one hand, image denoising performance largely determines image quality; moreover, the denoising effect on raw images affects the accuracy of the subsequent operations in the ISP processing flow. On the other hand, compared with still images, video carries motion information over time, so video denoising requires motion estimation, which is complex and computationally expensive. In view of these problems, this paper proposes a video denoising algorithm for raw images that performs multiple cascaded processing stages on raw-RGB images based on a convolutional neural network and carries out implicit motion estimation within the network. The denoising performance is far superior to that of traditional algorithms with minimal computation and bandwidth, and the algorithm has computational advantages over most deep learning algorithms.

Submitted 4 September, 2022; originally announced September 2022.

arXiv:2208.00875 [pdf]

doi 10.1109/ACCESS.2022.3183139

Network Coexistence Analysis of RIS-Assisted Wireless Communications

Authors: Yajun Zhao, Xin Lv

Abstract: Reconfigurable intelligent surfaces (RISs) have attracted the attention of academia and industry circles because of their ability to control the electromagnetic characteristics of channel environments. However, it has been found that the introduction of an RIS may bring new and more serious network coexistence problems. It may even further deteriorate the network performance if these new network coexistence problems cannot be effectively solved. In this paper, an RIS network coexistence model is proposed and discussed in detail, and these problems are deeply analysed. Two novel RIS design mechanisms, including a novel multilayer RIS structure with an out-of-band filter and an RIS blocking mechanism, are further explored. Finally, numerical results and a discussion are given.

Submitted 27 July, 2022; originally announced August 2022.

Comments: 17 pages, 16 figures

Journal ref: IEEE ACCESS VOLUME 10, 2022

arXiv:2207.07627 [pdf, other]

Dual-space Compressed Sensing

Authors: Xudong Lv, Ashok Ajoy

Abstract: Compressed sensing (CS) is a powerful method routinely employed to accelerate image acquisition. It is particularly suited to situations when the image under consideration is sparse but can be sampled in a basis where it is non-sparse. Here we propose an alternate CS regime in situations where the image can be sampled in two incoherent spaces simultaneously, with a special focus on image sampling in Fourier reciprocal spaces (e.g. real-space and k-space). Information is fed-forward from one space to the other, allowing new opportunities to efficiently solve the optimization problem at the heart of CS image reconstruction. We show that considerable gains in imaging acceleration are then possible over conventional CS. The technique provides enhanced robustness to noise, and is well suited to edge-detection problems. We envision applications for imaging collections of nanodiamond (ND) particles targeting specific regions in a volume of interest, exploiting the ability of lattice defects (NV centers) to allow ND particles to be imaged in reciprocal spaces simultaneously via optical fluorescence and 13C magnetic resonance imaging (MRI) respectively. Broadly this work suggests the potential to interface CS principles with hybrid sampling strategies to yield speedup in signal acquisition in many practical settings.

Submitted 15 July, 2022; originally announced July 2022.

arXiv:2206.06784 [pdf, other]

Stochastic Event-triggered Variational Bayesian Filtering

Authors: Xiaoxu Lv, Peihu Duan, Zhisheng Duan, Guanrong Chen, Ling Shi

Abstract: This paper proposes an event-triggered variational Bayesian filter for remote state estimation with unknown and time-varying noise covariances. After presetting multiple nominal process noise covariances and an initial measurement noise covariance, a variational Bayesian method and a fixed-point iteration method are utilized to jointly estimate the posterior state vector and the unknown noise covariances under a stochastic event-triggered mechanism. The proposed algorithm ensures low communication loads and excellent estimation performances for a wide range of unknown noise covariances. Finally, the performance of the proposed algorithm is demonstrated by tracking simulations of a vehicle.

Submitted 14 June, 2022; originally announced June 2022.

arXiv:2202.04855 [pdf, ps, other]

The USTC-Ximalaya system for the ICASSP 2022 multi-channel multi-party meeting transcription (M2MeT) challenge

Authors: Maokui He, Xiang Lv, Weilin Zhou, Jingjing Yin, Xiaoqi Zhang, Yuxuan Wang, Shutong Niu, Yuhang Cao, Heng Lu, Jun Du, Chin-Hui Lee

Abstract: We propose two improvements to target-speaker voice activity detection (TS-VAD), the core component in our proposed speaker diarization system that was submitted to the 2022 Multi-Channel Multi-Party Meeting Transcription (M2MeT) challenge. These techniques are designed to handle multi-speaker conversations in real-world meeting scenarios with high speaker-overlap ratios and under heavy reverberant and noisy condition. First, for data preparation and augmentation in training TS-VAD models, speech data containing both real meetings and simulated indoor conversations are used. Second, in refining results obtained after TS-VAD based decoding, we perform a series of post-processing steps to improve the VAD results needed to reduce diarization error rates (DERs). Tested on the ALIMEETING corpus, the newly released Mandarin meeting dataset used in M2MeT, we demonstrate that our proposed system can decrease the DER by up to 66.55/60.59% relatively when compared with classical clustering based diarization on the Eval/Test set.

Submitted 10 February, 2022; originally announced February 2022.

arXiv:2007.09452 [pdf, ps, other]

doi 10.1049/cth2.12075

Achieving Optimal Output Consensus for Discrete-time Linear Multi-agent Systems with Disturbance Rejection

Authors: Yutao Tang, Hao Zhu, Xiaoyong Lv

Abstract: In this paper, an optimal output consensus problem is studied for discrete-time linear multiagent systems subject to external disturbances. Each agent is assigned with a local cost function which is known only to itself. Distributed protocols are to be designed to guarantee an output consensus for these high-order agents and meanwhile minimize the aggregate cost as the sum of these local costs. To overcome the difficulties brought by high-order dynamics and external disturbances, we develop an embedded design and constructively present a distributed rule to solve this problem. The proposed control includes three terms: an optimal signal generator under a directed information graph, an observer-based compensator to reject these disturbances, and a reference tracking controller for these linear agents. It is shown to solve the formulated problem with some mild assumptions. A numerical example is also provided to illustrate the effectiveness of our proposed distributed control laws.

Submitted 25 December, 2020; v1 submitted 18 July, 2020; originally announced July 2020.

Comments: 15 pages, 3 figures

MSC Class: 93A16; 93A13

arXiv:1912.01203 [pdf]

Music Style Classification with Compared Methods in XGB and BPNN

Authors: Lifeng Tan, Cong Jin, Zhiyuan Cheng, Xin Lv, Leiyu Song

Abstract: Scientists have used many different classification methods to solve the problem of music classification, but the efficiency of each method differs. In this paper, we compare two methods on the task of music style classification. More specifically, feature extraction methods for representing timbral texture, rhythmic content, and pitch content are proposed. Comparative evaluations of the two classifiers' performance were conducted for music classification across different styles. The results show that XGB is better suited to small datasets than BPNN.

Submitted 3 December, 2019; originally announced December 2019.

Comments: 5 pages, 1 figure

Showing 1–11 of 11 results for author: Lv, X