Search | arXiv e-print repository

Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms

Authors: Joshua Ashkinaze, Ruijia Guan, Laura Kurek, Eytan Adar, Ceren Budak, Eric Gilbert

Abstract: Large language models (LLMs) are trained on broad corpora and then used in communities with specialized norms. Is providing LLMs with community rules enough for models to follow these norms? We evaluate LLMs' capacity to detect (Task 1) and correct (Task 2) biased Wikipedia edits according to Wikipedia's Neutral Point of View (NPOV) policy. LLMs struggled with bias detection, achieving only 64% accuracy on a balanced dataset. Models exhibited contrasting biases (some under- and others over-predicted bias), suggesting distinct priors about neutrality. LLMs performed better at generation, removing 79% of words removed by Wikipedia editors. However, LLMs made additional changes beyond Wikipedia editors' simpler neutralizations, resulting in high-recall but low-precision editing. Interestingly, crowdworkers rated AI rewrites as more neutral (70%) and fluent (61%) than Wikipedia-editor rewrites. Qualitative analysis found LLMs sometimes applied NPOV more comprehensively than Wikipedia editors but often made extraneous non-NPOV-related changes (such as grammar). LLMs may apply rules in ways that resonate with the public but diverge from community experts. While potentially effective for generation, LLMs may reduce editor agency and increase moderation workload (e.g., verifying additions). Even when rules are easy to articulate, having LLMs apply them like community members may still be difficult.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2405.19821 [pdf]

Polarized sub-meV Photoluminescence in 2D PbS Nanoplatelets at Cryogenic Temperatures

Authors: Pengji Li, Leon Biesterfeld, Lars Klepzig, Jingzhong Yang, Huu Thoai Ngo, Ahmed Addad, Tom N. Rakow, Ruolin Guan, Eddy P. Rugeramigabo, Louis Biadala, Jannika Lauth, Michael Zopf

Abstract: Colloidal semiconductor nanocrystals are promising materials for classical and quantum light sources due to their versatile chemistry and efficient photoluminescence (PL) properties. While visible emitters are well-established, the pursuit of excellent (near-)infrared sources continues. One notable candidate in this regard is photoluminescent two-dimensional (2D) PbS nanoplatelets (NPLs) exhibiting excitonic emission at 720 nm (1.7 eV), directly tying to the typical emission range limit of CdSe NPLs. Here, we present the first comprehensive analysis of low-temperature PL from this material class. Ultrathin 2D PbS NPLs exhibit high crystallinity confirmed by scanning transmission electron microscopy, which also reveals Moiré patterns in overlapping structures. At 4 K, we observe unique PL features in single PbS NPLs, including narrow zero-phonon lines with line widths down to 0.6 meV and a linear degree of polarization up to 90%. Time-resolved measurements identify trions as the dominant emission source with a 2.3 ns decay time. Sub-meV spectral diffusion and no immanent blinking over minutes are observed, as well as discrete spectral jumps without memory effects. These findings advance the understanding and underpin the potential of colloidal PbS NPLs for optical and quantum technologies.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.12821 [pdf, other]

Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension

Authors: Runwei Guan, Ruixiao Zhang, Ningwei Ouyang, Jianan Liu, Ka Lok Man, Xiaohao Cai, Ming Xu, Jeremy Smith, Eng Gee Lim, Yutao Yue, Hui Xiong

Abstract: Embodied perception is essential for intelligent vehicles and robots, enabling more natural interaction and task execution. However, these advancements currently focus on the vision level and rarely use 3D modeling sensors, which limits the full understanding of surrounding objects with multi-granular characteristics. Recently, as a promising automotive sensor with affordable cost, 4D Millimeter-Wave radar provides denser point clouds than conventional radar and perceives both semantic and physical characteristics of objects, thus enhancing the reliability of the perception system. To foster the development of natural language-driven context understanding in radar scenes for 3D grounding, we construct the first dataset, Talk2Radar, which bridges these two modalities for 3D Referring Expression Comprehension (REC). Talk2Radar contains 8,682 referring prompt samples with 20,558 referred objects. Moreover, we propose a novel model, T-RadarNet, for 3D REC on point clouds, achieving state-of-the-art performance on the Talk2Radar dataset compared with counterparts, where Deformable-FPN and Gated Graph Fusion are designed for efficient point cloud feature modeling and cross-modal fusion between radar and text features, respectively. Further, comprehensive experiments are conducted to give a deep insight into radar-based 3D REC. We release our project at https://github.com/GuanRunwei/Talk2Radar.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 8 pages, 5 figures

arXiv:2405.12434 [pdf, other]

Resolving Word Vagueness with Scenario-guided Adapter for Natural Language Inference

Authors: Yonghao Liu, Mengyu Li, Di Liang, Ximing Li, Fausto Giunchiglia, Lan Huang, Xiaoyue Feng, Renchu Guan

Abstract: Natural Language Inference (NLI) is a crucial task in natural language processing that involves determining the relationship between two sentences, typically referred to as the premise and the hypothesis. However, traditional NLI models solely rely on the semantic information inherent in independent sentences and lack relevant situational visual information, which can hinder a complete understanding of the intended meaning of the sentences due to the ambiguity and vagueness of language. To address this challenge, we propose an innovative ScenaFuse adapter that simultaneously integrates large-scale pre-trained linguistic knowledge and relevant visual information for NLI tasks. Specifically, we first design an image-sentence interaction module to incorporate visuals into the attention mechanism of the pre-trained model, allowing the two modalities to interact comprehensively. Furthermore, we introduce an image-sentence fusion module that can adaptively integrate visual information from images and semantic information from sentences. By incorporating relevant visual information and leveraging linguistic knowledge, our approach bridges the gap between language and vision, leading to improved understanding and inference capabilities in NLI tasks. Extensive benchmark experiments demonstrate that our proposed ScenaFuse, a scenario-guided approach, consistently boosts NLI performance.

Submitted 20 May, 2024; originally announced May 2024.

Comments: IJCAI24

arXiv:2405.11524 [pdf, other]

Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification

Authors: Mengyu Li, Yonghao Liu, Fausto Giunchiglia, Xiaoyue Feng, Renchu Guan

Abstract: Text classification is a crucial and fundamental task in natural language processing. Compared with the previous learning paradigm of pre-training and fine-tuning by cross entropy loss, the recently proposed supervised contrastive learning approach has received tremendous attention due to its powerful feature learning capability and robustness. Although several studies have incorporated this technique for text classification, some limitations remain. First, many text datasets are imbalanced, and the learning mechanism of supervised contrastive learning is sensitive to data imbalance, which may harm the model performance. Moreover, these models leverage a separate classification branch with cross entropy and a supervised contrastive learning branch without explicit mutual guidance. To this end, we propose a novel model named SharpReCL for imbalanced text classification tasks. First, we obtain the prototype vector of each class in the balanced classification branch to act as a representation of each class. Then, by further explicitly leveraging the prototype vectors, we construct a proper and sufficient target sample set with the same size for each class to perform the supervised contrastive learning procedure. The empirical results show the effectiveness of our model, which even outperforms popular large language models across several datasets.

Submitted 19 May, 2024; originally announced May 2024.

Comments: 12 pages, 9 figures

arXiv:2405.03117 [pdf, other]

Galaxies with Biconical Ionized Structure in MaNGA - I. Sample Selection and Driven Mechanisms

Authors: Zhi-Jie Zhou, Yan-Mei Chen, Run-Quan Guan, Yong Shi, Qiu-Sheng Gu, Dmitry Bizyaev

Abstract: Based on the integral field unit (IFU) data from the Mapping Nearby Galaxies at Apache Point Observatory (MaNGA) survey, we develop a new method to select galaxies with biconical ionized structures, building a sample of 142 edge-on biconical ionized galaxies. We classify these 142 galaxies into 81 star-forming galaxies, 31 composite galaxies, and 30 AGNs (consisting of 23 Seyferts and 7 LI(N)ERs) according to the [N II]-BPT diagram. The star-forming bicones have bar-like structures while AGN bicones display hourglass structures, and composite bicones exhibit transitional morphologies between them due to both black hole and star-formation activities. Star-forming bicones have intense star-formation activities in their central regions, and the primary driver of biconical structures is the central star formation rate surface density. The lack of difference in the strength of central black hole activities (traced by dust-attenuation-corrected [O III] λ5007 luminosity and Eddington ratio) between Seyfert bicones and their control samples can be naturally explained if the accretion disk and the galactic disk are not necessarily coplanar. Additionally, the biconical galaxies with central LI(N)ER-like line ratios are edge-on disk galaxies that show strong central dust attenuation. The radial gradients of Hα surface brightness follow the $r^{-2.35}$ relation, roughly consistent with the $r^{-2}$ profile expected in the case of photoionization by a central point-like source. These observations indicate obscured AGNs or AGN echoes as the primary drivers of biconical structures in LI(N)ERs.

Submitted 5 May, 2024; originally announced May 2024.

Comments: 12 pages, 9 figures, 1 table, Accepted for publication in MNRAS

arXiv:2404.18480 [pdf, ps, other]

Asymptotic stability of composite waves of viscous shock and rarefaction for relaxed compressible Navier-Stokes equations

Authors: Renyong Guan, Yuxi Hu

Abstract: The time asymptotic stability for one-dimensional relaxed compressible Navier-Stokes equations is studied. We show that the composite waves of viscous shock and rarefaction are asymptotically nonlinearly stable with both small wave strength and small initial perturbations. Moreover, as the relaxation parameter goes to zero, the solutions of the relaxed system are shown to converge globally in time to those of the classical system. The methods are based on relative entropy, the a-contraction with shifts theory and basic energy estimates.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.10342 [pdf, other]

Referring Flexible Image Restoration

Authors: Runwei Guan, Rongsheng Hu, Zhuhao Zhou, Tianlang Xue, Ka Lok Man, Jeremy Smith, Eng Gee Lim, Weiping Ding, Yutao Yue

Abstract: In reality, images often exhibit multiple degradations, such as rain and fog at night (triple degradations). However, in many cases, individuals may not want to remove all degradations, for instance, a blurry lens revealing a beautiful snowy landscape (double degradations). In such scenarios, people may only desire to deblur. These situations and requirements shed light on a new challenge in image restoration, where a model must perceive and remove specific degradation types specified by human commands in images with multiple degradations. We term this task Referring Flexible Image Restoration (RFIR). To address this, we first construct a large-scale synthetic dataset called RFIR, comprising 153,423 samples with the degraded image, text prompt for specific degradation removal and restored image. RFIR consists of five basic degradation types: blur, rain, haze, low light and snow, while six main sub-categories are included for varying degrees of degradation removal. To tackle the challenge, we propose a novel transformer-based multi-task model named TransRFIR, which simultaneously perceives degradation types in the degraded image and removes specific degradation upon text prompt. TransRFIR is based on two devised attention modules, Multi-Head Agent Self-Attention (MHASA) and Multi-Head Agent Cross Attention (MHACA), which introduce the agent token and reach linear complexity, achieving lower computation cost than vanilla self-attention and cross-attention while obtaining competitive performance. Our TransRFIR achieves state-of-the-art performance compared with other counterparts and is proven to be an effective architecture for image restoration. We release our project at https://github.com/GuanRunwei/FIR-CP.

Submitted 16 April, 2024; originally announced April 2024.

Comments: 15 pages, 19 figures

arXiv:2404.09790 [pdf, other]

NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results

Authors: Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, et al. (63 additional authors not shown)

Abstract: This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained. The challenge involves generating corresponding high-resolution (HR) images, magnified by a factor of four, from low-resolution (LR) inputs using prior information. The LR images originate from bicubic downsampling degradation. The aim of the challenge is to obtain designs/solutions with the most advanced SR performance, with no constraints on computational resources (e.g., model size and FLOPs) or training data. The track of this challenge assesses performance with the PSNR metric on the DIV2K testing dataset. The competition attracted 199 registrants, with 20 teams submitting valid entries. This collective endeavour not only pushes the boundaries of performance in single-image SR but also offers a comprehensive overview of current trends in this field.

Submitted 15 April, 2024; originally announced April 2024.

Comments: NTIRE 2024 webpage: https://cvlai.net/ntire/2024. Code: https://github.com/zhengchen1999/NTIRE2024_ImageSR_x4

arXiv:2404.05211 [pdf, other]

Multi-level Graph Subspace Contrastive Learning for Hyperspectral Image Clustering

Authors: Jingxin Wang, Renxiang Guan, Kainan Gao, Zihao Li, Hao Li, Xianju Li, Chang Tang

Abstract: Hyperspectral image (HSI) clustering is a challenging task due to its high complexity. Although subspace clustering shows impressive performance for HSI, traditional methods tend to ignore the global-local interaction in HSI data. In this study, we proposed a multi-level graph subspace contrastive learning (MLGSC) method for HSI clustering. The model is divided into the following main parts. Graph convolution subspace construction: utilizing spectral and texture features to construct two graph convolution views. Local-global graph representation: local graph representations were obtained by step-by-step convolutions and a more representative global graph representation was obtained using an attention-based pooling strategy. Multi-level graph subspace contrastive learning: multi-level contrastive learning was conducted to obtain local-global joint graph representations, to improve the consistency of the positive samples between views, and to obtain more robust graph embeddings. Specifically, graph-level contrastive learning is used to better learn global representations of HSI data. Node-level intra-view and inter-view contrastive learning is designed to learn joint representations of local regions of HSI. The proposed model is evaluated on four popular HSI datasets: Indian Pines, Pavia University, Houston, and Xu Zhou. The overall accuracies are 97.75%, 99.96%, 92.28%, and 95.73%, respectively, significantly outperforming the current state-of-the-art clustering methods.

Submitted 8 April, 2024; originally announced April 2024.

Comments: IJCNN 2024

arXiv:2404.00964 [pdf, other]

S2RC-GCN: A Spatial-Spectral Reliable Contrastive Graph Convolutional Network for Complex Land Cover Classification Using Hyperspectral Images

Authors: Renxiang Guan, Zihao Li, Chujia Song, Guo Yu, Xianju Li, Ruyi Feng

Abstract: Spatial correlations between different ground objects are an important feature of mining land cover research. Graph Convolutional Networks (GCNs) can effectively capture such spatial feature representations and have demonstrated promising results in performing hyperspectral imagery (HSI) classification tasks of complex land. However, the existing GCN-based HSI classification methods are prone to interference from redundant information when extracting complex features. To classify complex scenes more effectively, this study proposes a novel spatial-spectral reliable contrastive graph convolutional classification framework named S2RC-GCN. Specifically, we fused the spectral and spatial features extracted by the 1D- and 2D-encoder, and the 2D-encoder includes an attention model to automatically extract important information. We then leveraged the fused high-level features to construct graphs and fed the resulting graphs into the GCNs to determine more effective graph representations. Furthermore, a novel reliable contrastive graph convolution was proposed for reliable contrastive learning to learn and fuse robust features. Finally, to test the performance of the model on complex object classification, we used imagery taken by Gaofen-5 in the Jiang Xia area to construct complex land cover datasets. The test results show that compared with other models, our model achieved the best results and effectively improved the classification performance of complex remote sensing imagery.

Submitted 1 April, 2024; originally announced April 2024.

Comments: Accepted to IJCNN 2024 (International Joint Conference on Neural Networks)

arXiv:2403.12686 [pdf, other]

WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar

Authors: Runwei Guan, Liye Jia, Fengyufan Yang, Shanliang Yao, Erick Purwanto, Xiaohui Zhu, Eng Gee Lim, Jeremy Smith, Ka Lok Man, Xuming Hu, Yutao Yue

Abstract: The perception of waterways based on human intent is significant for autonomous navigation and operations of Unmanned Surface Vehicles (USVs) in water environments. Inspired by visual grounding, we introduce WaterVG, the first visual grounding dataset designed for USV-based waterway perception based on human prompts. WaterVG encompasses prompts describing multiple targets, with annotations at the instance level including bounding boxes and masks. Notably, WaterVG includes 11,568 samples with 34,987 referred targets, whose prompts integrate both visual and radar characteristics. The pattern of text-guided two sensors equips a finer granularity of text prompts with visual and radar features of referred targets. Moreover, we propose a low-power visual grounding model, Potamoi, which is a multi-task model with a well-designed Phased Heterogeneous Modality Fusion (PHMF) mode, including Adaptive Radar Weighting (ARW) and Multi-Head Slim Cross Attention (MHSCA). Specifically, ARW extracts required radar features to fuse with vision for prompt alignment. MHSCA is an efficient fusion module with a remarkably small parameter count and FLOPs, fusing scenario context captured by two sensors with linguistic features, and it performs well on visual grounding tasks. Comprehensive experiments and evaluations have been conducted on WaterVG, where our Potamoi achieves state-of-the-art performance compared with counterparts.

Submitted 4 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: 10 pages, 10 figures

arXiv:2403.01465 [pdf]

Multiview Subspace Clustering of Hyperspectral Images based on Graph Convolutional Networks

Authors: Xianju Li, Renxiang Guan, Zihao Li, Hao Liu, Jing Yang

Abstract: High-dimensional and complex spectral structures make clustering of hyperspectral images (HSI) a challenging task. Subspace clustering has been shown to be an effective approach for addressing this problem. However, current subspace clustering algorithms are mainly designed for a single view and do not fully exploit spatial or texture feature information in HSI. This study proposed a multiview subspace clustering of HSI based on graph convolutional networks. (1) This paper uses the powerful classification ability of graph convolutional network and the learning ability of topological relationships between nodes to analyze and express the spatial relationship of HSI. (2) Pixel texture and pixel neighbor spatial-spectral information were sent to construct two graph convolutional subspaces. (3) An attention-based fusion module was used to adaptively construct a more discriminative feature map. The model was evaluated on three popular HSI datasets, including Indian Pines, Pavia University, and Houston. It achieved overall accuracies of 92.38%, 93.43%, and 83.82%, respectively, and significantly outperformed the state-of-the-art clustering methods. In conclusion, the proposed model can effectively improve the clustering accuracy of HSI.

Submitted 3 March, 2024; originally announced March 2024.

Comments: This paper was accepted by APWEB-WAIM 2024

arXiv:2312.09630 [pdf, other]

Pixel-Superpixel Contrastive Learning and Pseudo-Label Correction for Hyperspectral Image Clustering

Authors: Renxiang Guan, Zihao Li, Xianju Li, Chang Tang

Abstract: Hyperspectral image (HSI) clustering is gaining considerable attention owing to recent methods that overcome the inefficiency and misleading results from the absence of supervised information. Contrastive learning methods excel at existing pixel-level and superpixel-level HSI clustering tasks. The pixel-level contrastive learning method can effectively improve the ability of the model to capture fine features of HSI but requires a large time overhead. The superpixel-level contrastive learning method utilizes the homogeneity of HSI and reduces computing resources; however, it yields rough classification results. To exploit the strengths of both methods, we present a pixel-superpixel contrastive learning and pseudo-label correction (PSCPC) method for HSI clustering. PSCPC can reasonably capture domain-specific and fine-grained features through superpixels and the comparative learning of a small number of pixels within the superpixels. To improve the clustering performance of superpixels, this paper proposes a pseudo-label correction module that aligns the clustering pseudo-labels of pixels and superpixels. In addition, pixel-level clustering results are used to supervise superpixel-level clustering, improving the generalization ability of the model. Extensive experiments demonstrate the effectiveness and efficiency of PSCPC.

Submitted 15 December, 2023; originally announced December 2023.

Comments: Accepted at IEEE ICASSP 2024

arXiv:2312.08851 [pdf, other]

Achelous++: Power-Oriented Water-Surface Panoptic Perception Framework on Edge Devices based on Vision-Radar Fusion and Pruning of Heterogeneous Modalities

Authors: Runwei Guan, Haocheng Zhao, Shanliang Yao, Ka Lok Man, Xiaohui Zhu, Limin Yu, Yong Yue, Jeremy Smith, Eng Gee Lim, Weiping Ding, Yutao Yue

Abstract: Urban water-surface robust perception serves as the foundation for intelligent monitoring of aquatic environments and the autonomous navigation and operation of unmanned vessels, especially in the context of waterway safety. It is worth noting that current multi-sensor fusion and multi-task learning models consume substantial power and heavily rely on high-power GPUs for inference. This contributes to increased carbon emissions, a concern that runs counter to the prevailing emphasis on environmental preservation and the pursuit of sustainable, low-carbon urban environments. In light of these concerns, this paper concentrates on low-power, lightweight, multi-task panoptic perception through the fusion of visual and 4D radar data, which is seen as a promising low-cost perception method. We propose a framework named Achelous++ that facilitates the development and comprehensive evaluation of multi-task water-surface panoptic perception models. Achelous++ can simultaneously execute five perception tasks with high speed and low power consumption, including object detection, object semantic segmentation, drivable-area segmentation, waterline segmentation, and radar point cloud semantic segmentation. Furthermore, to meet the demand for developers to customize models for real-time inference on low-performance devices, a novel multi-modal pruning strategy known as Heterogeneous-Aware SynFlow (HA-SynFlow) is proposed. Besides, Achelous++ also supports random pruning at initialization with different layer-wise sparsity, such as Uniform and Erdos-Renyi-Kernel (ERK). Overall, our Achelous++ framework achieves state-of-the-art performance on the WaterScenes benchmark, excelling in both accuracy and power efficiency compared to other single-task and multi-task models. We release and maintain the code at https://github.com/GuanRunwei/Achelous.

Submitted 14 December, 2023; originally announced December 2023.

Comments: 18 pages, 9 figures

arXiv:2312.06068 [pdf, other]

Contrastive Multi-view Subspace Clustering of Hyperspectral Images based on Graph Convolutional Networks

Authors: Renxiang Guan, Zihao Li, Xianju Li, Chang Tang, Ruyi Feng

Abstract: High-dimensional and complex spectral structures make the clustering of hyperspectral images (HSI) a challenging task. Subspace clustering is an effective approach for addressing this problem. However, current subspace clustering algorithms are primarily designed for a single view and do not fully exploit the spatial or textural feature information in HSI. In this study, contrastive multi-view subspace clustering of HSI was proposed based on graph convolutional networks. Pixel neighbor textural and spatial-spectral information were sent to construct two graph convolutional subspaces to learn their affinity matrices. To maximize the interaction between different views, a contrastive learning algorithm was introduced to promote the consistency of positive samples and assist the model in extracting robust features. An attention-based fusion module was used to adaptively integrate these affinity matrices, constructing a more discriminative affinity matrix. The model was evaluated using four popular HSI datasets: Indian Pines, Pavia University, Houston, and Xu Zhou. It achieved overall accuracies of 97.61%, 96.69%, 87.21%, and 97.65%, respectively, and significantly outperformed state-of-the-art clustering methods. In conclusion, the proposed model effectively improves the clustering accuracy of HSI.

Submitted 10 December, 2023; originally announced December 2023.

arXiv:2312.04861 [pdf, other]

Exploring Radar Data Representations in Autonomous Driving: A Comprehensive Review

Authors: Shanliang Yao, Runwei Guan, Zitian Peng, Chenhang Xu, Yilu Shi, Weiping Ding, Eng Gee Lim, Yong Yue, Hyungjoon Seo, Ka Lok Man, Jieming Ma, Xiaohui Zhu, Yutao Yue

Abstract: With the rapid advancements of sensor technology and deep learning, autonomous driving systems are providing safe and efficient access to intelligent vehicles as well as intelligent transportation. Among these equipped sensors, the radar sensor plays a crucial role in providing robust perception information in diverse environmental conditions. This review focuses on exploring different radar data representations utilized in autonomous driving systems. Firstly, we introduce the capabilities and limitations of the radar sensor by examining the working principles of radar perception and signal processing of radar measurements. Then, we delve into the generation process of five radar representations, including the ADC signal, radar tensor, point cloud, grid map, and micro-Doppler signature. For each radar representation, we examine the related datasets, methods, advantages and limitations. Furthermore, we discuss the challenges faced in these data representations and propose potential research directions. Above all, this comprehensive review offers an in-depth insight into how these representations enhance autonomous system capabilities, providing guidance for radar perception researchers. To facilitate retrieval and comparison of different data representations, datasets and methods, we provide an interactive website at https://radar-camera-fusion.github.io/radar.

Submitted 19 April, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

Comments: 24 pages, 10 figures, 5 tables. arXiv admin note: text overlap with arXiv:2304.10410

arXiv:2308.10287 [pdf, other]

ASY-VRNet: Waterway Panoptic Driving Perception Model based on Asymmetric Fair Fusion of Vision and 4D mmWave Radar

Authors: Runwei Guan, Shanliang Yao, Xiaohui Zhu, Ka Lok Man, Yong Yue, Jeremy Smith, Eng Gee Lim, Yutao Yue

Abstract: Panoptic Driving Perception (PDP) is critical for the autonomous navigation of Unmanned Surface Vehicles (USVs). A PDP model typically integrates multiple tasks, necessitating the simultaneous and robust execution of various perception tasks to facilitate downstream path planning. The fusion of visual and radar sensors is currently acknowledged as a robust and cost-effective approach. However, most existing research has primarily focused on fusing visual and radar features dedicated to object detection or utilizing a shared feature space for multiple tasks, neglecting the individual representation differences between various tasks. To address this gap, we propose a pair of Asymmetric Fair Fusion (AFF) modules with favorable explainability designed to efficiently interact with independent features from both visual and radar modalities, tailored to the specific requirements of object detection and semantic segmentation tasks. The AFF modules treat image and radar maps as irregular point sets and transform these features into a crossed-shared feature space for multitasking, ensuring equitable treatment of vision and radar point cloud features. Leveraging AFF modules, we propose a novel and efficient PDP model, ASY-VRNet, which processes image and radar features based on irregular super-pixel point sets. Additionally, we propose an effective multitask learning method specifically designed for PDP models. Compared to other lightweight models, ASY-VRNet achieves state-of-the-art performance in object detection, semantic segmentation, and drivable-area segmentation on the WaterScenes benchmark. Our project is publicly available at https://github.com/GuanRunwei/ASY-VRNet.

Submitted 4 July, 2024; v1 submitted 20 August, 2023; originally announced August 2023.

Comments: Accepted by IROS 2024

arXiv:2307.09063 [pdf, other]

Radar-STDA: A High-Performance Spatial-Temporal Denoising Autoencoder for Interference Mitigation of FMCW Radars

Authors: Lulu Liu, Runwei Guan, Fei Ma, Jeremy Smith, Yutao Yue

Abstract: With its small size, low cost and all-weather operation, millimeter-wave radar can accurately measure the distance, azimuth and radial velocity of a target compared to other traffic sensors. However, in practice, millimeter-wave radars are plagued by various interferences, leading to a drop in target detection accuracy or even failure to detect targets. This is undesirable in autonomous vehicles and traffic surveillance, as it is likely to threaten human life and cause property damage. Therefore, interference mitigation is of great significance for millimeter-wave radar-based target detection. Deep learning is developing rapidly, but existing deep learning-based interference mitigation models still have great limitations in terms of model size and inference speed. For these reasons, we propose Radar-STDA, a Radar-Spatial Temporal Denoising Autoencoder. Radar-STDA is an efficient nano-level denoising autoencoder that takes into account both spatial and temporal information of range-Doppler maps. Among other methods, it achieves a maximum SINR of 17.08 dB with only 140,000 parameters. It obtains 207.6 FPS on an RTX A4000 GPU and 56.8 FPS on an NVIDIA Jetson AGX Xavier, respectively, when denoising range-Doppler maps for three consecutive frames. Moreover, we release a synthetic data set called Ra-inf for the task, which involves 384,769 range-Doppler maps with various clutters from objects of no interest and receiver noise in realistic scenarios. To the best of our knowledge, Ra-inf is the first synthetic dataset of radar interference. To support the community, our research is open-source via the link https://github.com/GuanRunwei/rd_map_temporal_spatial_denoising_autoencoder.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.07102 [pdf, other]

Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar

Authors: Runwei Guan, Shanliang Yao, Xiaohui Zhu, Ka Lok Man, Eng Gee Lim, Jeremy Smith, Yong Yue, Yutao Yue

Abstract: Current perception models for different tasks usually exist in modular forms on Unmanned Surface Vehicles (USVs), which infer extremely slowly in parallel on edge devices, causing asynchrony between perception results and USV position and leading to erroneous autonomous navigation decisions. Compared with Unmanned Ground Vehicles (UGVs), the robust perception of USVs develops relatively slowly. Moreover, most current multi-task perception models are huge in parameters, slow in inference and not scalable. To address this, we propose Achelous, a low-cost and fast unified panoptic perception framework for water-surface perception based on the fusion of a monocular camera and 4D mmWave radar. Achelous can simultaneously perform five tasks: detection and segmentation of visual targets, drivable-area segmentation, waterline segmentation and radar point cloud segmentation. Besides, models in the Achelous family, with fewer than around 5 million parameters, achieve about 18 FPS on an NVIDIA Jetson AGX Xavier, 11 FPS faster than HybridNets, and exceed YOLOX-Tiny and Segformer-B0 on our collected dataset by about 5 mAP$_{\text{50-95}}$ and 0.7 mIoU, especially under situations of adverse weather, dark environments and camera failure. To our knowledge, Achelous is the first comprehensive panoptic perception framework combining vision-level and point-cloud-level tasks for water-surface perception. To promote the development of the intelligent transportation community, we release our code at https://github.com/GuanRunwei/Achelous.

Submitted 13 July, 2023; originally announced July 2023.

Comments: Accepted by ITSC 2023

arXiv:2307.06505 [pdf, other]

WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset and Benchmarks for Autonomous Driving on Water Surfaces

Authors: Shanliang Yao, Runwei Guan, Zhaodong Wu, Yi Ni, Zile Huang, Ryan Wen Liu, Yong Yue, Weiping Ding, Eng Gee Lim, Hyungjoon Seo, Ka Lok Man, Jieming Ma, Xiaohui Zhu, Yutao Yue

Abstract: Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivor rescue, environmental monitoring, hydrography mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surfaces. Equipped with a 4D radar and a monocular camera, our Unmanned Surface Vehicle (USV) proffers all-weather solutions for discerning object-related information, including color, shape, texture, range, velocity, azimuth, and elevation. Focusing on typical static and dynamic objects on water surfaces, we label the camera images and radar point clouds at pixel-level and point-level, respectively. In addition to basic perception tasks, such as object detection, instance segmentation and semantic segmentation, we also provide annotations for free-space segmentation and waterline segmentation. Leveraging the multi-task and multi-modal data, we conduct benchmark experiments on the uni-modality of radar and camera, as well as the fused modalities. Experimental results demonstrate that 4D radar-camera fusion can considerably improve the accuracy and robustness of perception on water surfaces, especially in adverse lighting and weather conditions. The WaterScenes dataset is public at https://waterscenes.github.io.

Submitted 15 June, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: Accepted by IEEE Transactions on Intelligent Transportation Systems

arXiv:2305.19310 [pdf, other]

doi 10.1093/mnras/stae925

The Merian Survey: Design, Construction, and Characterization of a Filter Set Optimized to Find Dwarf Galaxies and Measure their Dark Matter Halo Properties with Weak Lensing

Authors: Yifei Luo, Alexie Leauthaud, Jenny Greene, Song Huang, Erin Kado-Fong, Shany Danieli, Ting S. Li, Jiaxuan Li, Diana Blanco, Erik J. Wasleske, Joseph Wick, Abby Mintz, Runquan Guan, Annika H. G. Peter, Vivienne Baldassare, Alyson Brooks, Arka Banerjee, Joy Bhattacharyya, Zheng Cai, Xinjun Chen, Jim Gunn, Sean D. Johnson, Lee S. Kelvin, Mingyu Li, Xiaojing Lin, et al. (6 additional authors not shown)

Abstract: The Merian survey is mapping $\sim$ 850 degrees$^2$ of the Hyper Suprime-Cam Strategic Survey Program (HSC-SSP) wide layer with two medium-band filters on the 4-meter Victor M. Blanco telescope at the Cerro Tololo Inter-American Observatory, with the goal of carrying out the first high signal-to-noise (S/N) measurements of weak gravitational lensing around dwarf galaxies. This paper presents the design of the Merian filter set: N708 ($λ_c = 7080$ Å, $Δλ = 275$ Å) and N540 ($λ_c = 5400$ Å, $Δλ = 210$ Å). The central wavelengths and filter widths of N708 and N540 were designed to detect the $\rm Hα$ and $\rm [OIII]$ emission lines of galaxies in the mass range $8<\rm \log M_*/M_\odot<9$ by comparing Merian fluxes with HSC broad-band fluxes. Our filter design takes into account the weak lensing S/N and photometric redshift performance. Our simulations predict that Merian will yield a sample of $\sim$ 85,000 star-forming dwarf galaxies with a photometric redshift accuracy of $σ_{Δz/(1+z)}\sim 0.01$ and an outlier fraction of $η=2.8\%$ over the redshift range $0.058<z<0.10$. With 60 full nights on the Blanco/Dark Energy Camera (DECam), the Merian survey is predicted to measure the average weak lensing profile around dwarf galaxies with lensing $\rm S/N \sim 32$ within $r<0.5$ Mpc and lensing $\rm S/N \sim 90$ within $r<1.0$ Mpc. This unprecedented sample of star-forming dwarf galaxies will allow for studies of the interplay between dark matter and stellar feedback and their roles in the evolution of dwarf galaxies.

Submitted 3 April, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: 18 pages, 14 figures, accepted for publication in MNRAS

arXiv:2304.10893 [pdf, other]

FindVehicle and VehicleFinder: A NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system

Authors: Runwei Guan, Ka Lok Man, Feifan Chen, Shanliang Yao, Rongsheng Hu, Xiaohui Zhu, Jeremy Smith, Eng Gee Lim, Yutao Yue

Abstract: Natural language (NL) based vehicle retrieval is a task aiming to retrieve a vehicle that is most consistent with a given NL query from among all candidate vehicles. Because NL queries can be easily obtained, such a task has a promising prospect in building an interactive intelligent traffic system (ITS). Current solutions mainly focus on extracting both text and image features and mapping them to the same latent space to compare the similarity. However, existing methods usually use dependency analysis or semantic role-labelling techniques to find keywords related to vehicle attributes. These techniques may require a lot of pre-processing and post-processing work, and also suffer from extracting the wrong keyword when the NL query is complex. To tackle these problems and simplify the process, we borrow the idea from named entity recognition (NER) and construct FindVehicle, a NER dataset in the traffic domain. It has 42.3k labelled NL descriptions of vehicle tracks, containing information such as the location, orientation, type and colour of the vehicle. FindVehicle also adopts both overlapping entities and fine-grained entities to meet further requirements. To verify its effectiveness, we propose a baseline NL-based vehicle retrieval model called VehicleFinder. Our experiment shows that by using text encoders pre-trained by FindVehicle, VehicleFinder achieves 87.7% precision and 89.4% recall when retrieving a target vehicle by text command on our homemade dataset based on UA-DETRAC. The time cost of VehicleFinder is 279.35 ms on one ARM v8.2 CPU and 93.72 ms on one RTX A4000 GPU, which is much faster than the Transformer-based system. The dataset is open-source via the link https://github.com/GuanRunwei/FindVehicle, and the implementation can be found via the link https://github.com/GuanRunwei/VehicleFinder-CTIM.

Submitted 21 April, 2023; originally announced April 2023.

arXiv:2304.10410 [pdf, other]

doi 10.1109/TIV.2023.3307157

Radar-Camera Fusion for Object Detection and Semantic Segmentation in Autonomous Driving: A Comprehensive Review

Authors: Shanliang Yao, Runwei Guan, Xiaoyu Huang, Zhuoxiao Li, Xiangyu Sha, Yong Yue, Eng Gee Lim, Hyungjoon Seo, Ka Lok Man, Xiaohui Zhu, Yutao Yue

Abstract: Driven by deep learning techniques, perception technology in autonomous driving has developed rapidly in recent years, enabling vehicles to accurately detect and interpret the surrounding environment for safe and efficient navigation. To achieve accurate and robust perception capabilities, autonomous vehicles are often equipped with multiple sensors, making sensor fusion a crucial part of the perception system. Among these fused sensors, radars and cameras enable a complementary and cost-effective perception of the surrounding environment regardless of lighting and weather conditions. This review aims to provide a comprehensive guideline for radar-camera fusion, particularly concentrating on perception tasks related to object detection and semantic segmentation. Based on the principles of the radar and camera sensors, we delve into the data processing process and representations, followed by an in-depth analysis and summary of radar-camera fusion datasets. In the review of methodologies in radar-camera fusion, we address interrogative questions, including "why to fuse", "what to fuse", "where to fuse", "when to fuse", and "how to fuse", subsequently discussing various challenges and potential research directions within this domain. To ease the retrieval and comparison of datasets and fusion methods, we also provide an interactive website: https://radar-camera-fusion.github.io.

Submitted 23 August, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

Comments: Accepted by IEEE Transactions on Intelligent Vehicles (T-IV)

Journal ref: IEEE Transactions on Intelligent Vehicles 2023

arXiv:2212.01586 [pdf, ps, other]

Transversality of the perturbed reduced Vafa-Witten moduli spaces on 4-manifolds

Authors: Ren Guan

Abstract: Previously we established the transversality of the general part of the Vafa-Witten moduli spaces; in this paper, we deal with the rest, i.e., the reduced part. We consider the Vafa-Witten equation on closed, oriented and smooth Riemannian 4-manifolds with $C\equiv0$, and construct perturbations to establish the transversality of the perturbed equation. Then we show that for a generic choice of the perturbation terms, the moduli space of solutions to the perturbed reduced Vafa-Witten equation for the structure group $SU(2)$ or $SO(3)$ on a closed 4-manifold is a smooth manifold of dimension zero.

Submitted 3 December, 2022; originally announced December 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2207.03701

arXiv:2210.02310 [pdf, ps, other]

$K_0$ groups of Connes' $Θ$-deformed $m$-planes

Authors: Ren Guan

Abstract: We show that the $K_0$ groups of Connes' $Θ$-deformed $m$-planes and their smooth versions are all $\mathbb{Z}$.

Submitted 5 October, 2022; originally announced October 2022.

arXiv:2208.06253 [pdf, ps, other]

$K_0$ groups of noncommutative $\mathbb{R}^{2n}$

Authors: Ren Guan

Abstract: In this paper we show that the $K_0$ groups of noncommutative $\mathbb{R}^{2n}$ are $\mathbb{Z}$ for all $n\in\mathbb{N}^*$ and make an approach to the calculation of the smooth case, which will bring many new sequence problems relating to binomial numbers.

Submitted 29 October, 2022; v1 submitted 8 August, 2022; originally announced August 2022.

arXiv:2208.06131 [pdf, ps, other]

A variation of the reduced Vafa-Witten equations on 4-manifolds

Authors: Ren Guan

Abstract: In this paper we consider a variation of the Vafa-Witten equations on compact, oriented and smooth 4-manifolds, and construct a set of perturbation terms to establish the transversality of those equations. The new perturbed equations provide us with a priori estimates of the solutions, while the original reduced Vafa-Witten equations do not. By applying the a priori estimates we show that the singularities of the solutions can be removed, and then construct the Uhlenbeck closure of the moduli spaces.

Submitted 12 August, 2022; originally announced August 2022.

Comments: arXiv admin note: text overlap with arXiv:2207.03701

arXiv:2208.03668 [pdf, ps, other]

doi 10.1051/0004-6361/202243805

Interpreting time-integrated polarization data of gamma-ray burst prompt emission

Authors: R. Y. Guan, M. X. Lan

Abstract: Aims. With the accumulation of polarization data in the gamma-ray burst (GRB) prompt phase, polarization models can be tested. Methods. We predicted the time-integrated polarizations of 37 GRBs with polarization observation. We used their observed spectral parameters to do this. In the model, the emission mechanism is synchrotron radiation, and the magnetic field configuration in the emission region was assumed to be large-scale ordered. Therefore, the predicted polarization degrees (PDs) are upper limits. Results. For most GRBs detected by the Gamma-ray Burst Polarimeter (GAP), POLAR, and AstroSat, the predicted PD can match the corresponding observed PD. Hence the synchrotron-emission model in a large-scale ordered magnetic field can interpret both the moderately low PDs ($\sim10\%$) detected by POLAR and relatively high PDs ($\sim45\%$) observed by GAP and AstroSat well. Therefore, the magnetic fields in these GRB prompt phases or at least during the peak times are dominated by the ordered component. However, the predicted PDs of GRB 110721A observed by GAP and GRB 180427A observed by AstroSat are both lower than the observed values. Because the synchrotron emission in an ordered magnetic field predicts the upper-limit of the PD for the synchrotron-emission models, PD observations of the two bursts challenge the synchrotron-emission model. Then we predict the PDs of the High-energy Polarimetry Detector (HPD) and Low-energy Polarimetry Detector (LPD) on board the upcoming POLAR-2. In the synchrotron-emission models, the concentrated PD values of the GRBs detected by HPD will be higher than the LPD, which might be different from the predictions of the dissipative photosphere model. Therefore, more accurate multiband polarization observations are highly desired to test models of the GRB prompt phase.

Submitted 7 October, 2022; v1 submitted 7 August, 2022; originally announced August 2022.

Comments: 6 pages, 5 figures, with updated AstroSat data, accepted by AA

Journal ref: A&A 670, A160 (2023)

arXiv:2207.03701 [pdf, ps, other]

On the general part of the perturbed Vafa-Witten moduli spaces on 4-manifolds

Authors: Ren Guan

Abstract: In this paper we consider the Vafa-Witten equations on closed, oriented and smooth 4-manifolds, and construct a set of perturbation terms to establish the transversality of the perturbed Vafa-Witten equations at the general part of the solutions. Then we show that for a generic choice of the perturbation terms, this part of the moduli space for the structure group $SU(2)$ or $SO(3)$ on a closed 4-manifold is a smooth manifold of dimension zero.

Submitted 8 November, 2022; v1 submitted 8 July, 2022; originally announced July 2022.

arXiv:2201.05722 [pdf, other]

Global stability of SIR model with heterogeneous transmission rate modeled by the Preisach operator

Authors: Ruofei Guan, Jana Kopfová, Dmitrii Rachinskii

Abstract: In recent years, classical epidemic models, which assume stationary behavior of individuals, have been extended to include an adaptive heterogeneous response of the population to the current state of the epidemic. However, it is widely accepted that human behavior can exhibit history-dependence as a consequence of learned experiences. This history-dependence is similar to hysteresis effects that have been well-studied in control theory. To illustrate the importance of history-dependence for epidemic theory, we study dynamics of a variant of the SIRS model where individuals exhibit lazy-switch responses to prevalence dynamics. The resulting model, which includes the Preisach hysteresis operator, possesses a continuum of endemic equilibrium states characterized by different proportions of susceptible, infected and recovered populations. We discuss stability properties of the endemic equilibrium set and relate them to the degree of heterogeneity of the adaptive response. Our results support the argument that public health responses during the emergence of a new disease can have long-term consequences for subsequent management efforts. The main mathematical contribution of this work is a method of global stability analysis, which uses a family of Lyapunov functions corresponding to different branches of the hysteresis operator.

Submitted 14 January, 2022; originally announced January 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2103.07989

MSC Class: 34D23; 92D30; 92D25

arXiv:2112.06104 [pdf, other]

doi 10.1145/3486635.3491070

Synthetic Map Generation to Provide Unlimited Training Data for Historical Map Text Detection

Authors: Zekun Li, Runyu Guan, Qianmu Yu, Yao-Yi Chiang, Craig A. Knoblock

Abstract: Many historical map sheets are publicly available for studies that require long-term historical geographic data. The cartographic design of these maps includes a combination of map symbols and text labels. Automatically reading text labels from map images could greatly speed up map interpretation and help generate rich metadata describing the map content. Many text detection algorithms have been proposed to locate text regions in map images automatically, but most of the algorithms are trained on out-of-domain datasets (e.g., scenic images). Training data determines the quality of machine learning models, and manually annotating text regions in map images is labor-intensive and time-consuming. On the other hand, existing geographic data sources, such as OpenStreetMap (OSM), contain machine-readable map layers, which allow us to separate out the text layer and obtain text label annotations easily. However, the cartographic styles of OSM map tiles and historical maps are significantly different. This paper proposes a method to automatically generate an unlimited amount of annotated historical map images for training text detection models. We use a style transfer model to convert contemporary map images into historical style and place text labels upon them. We show that the state-of-the-art text detection models (e.g., PSENet) can benefit from the synthetic historical maps and achieve significant improvement for historical map text detection.

Submitted 11 December, 2021; originally announced December 2021.

arXiv:2111.06454 [pdf, other]

Towards Transferring Human Preferences from Canonical to Actual Assembly Tasks

Authors: Heramb Nemlekar, Runyu Guan, Guanyang Luo, Satyandra K. Gupta, Stefanos Nikolaidis

Abstract: To assist human users according to their individual preference in assembly tasks, robots typically require user demonstrations in the given task. However, providing demonstrations in actual assembly tasks can be tedious and time-consuming. Our thesis is that we can learn user preferences in assembly tasks from demonstrations in a representative canonical task. Inspired by previous work in economy of human movement, we propose to represent user preferences as a linear function of abstract task-agnostic features, such as movement and physical and mental effort required by the user. For each user, we learn their preference from demonstrations in a canonical task and use the learned preference to anticipate their actions in the actual assembly task without any user demonstrations in the actual task. We evaluate our proposed method in a model-airplane assembly study and show that preferences can be effectively transferred from canonical to actual assembly tasks, enabling robots to anticipate user actions.

Submitted 24 June, 2022; v1 submitted 11 November, 2021; originally announced November 2021.

Comments: 7 pages, 8 figures, IEEE International Conference on Robot & Human Interactive Communication, Naples, Italy, Aug 2022

arXiv:2109.04360 [pdf, other]

doi 10.1109/IPIN51156.2021.9662577

Measuring Uncertainty in Signal Fingerprinting with Gaussian Processes Going Deep

Authors: Ran Guan, Andi Zhang, Mengchao Li, Yongliang Wang

Abstract: In indoor positioning, signal fluctuation is highly location-dependent. However, signal uncertainty is one critical yet commonly overlooked dimension of the radio signal to be fingerprinted. This paper reviews the commonly used Gaussian Processes (GP) for probabilistic positioning and points out the pitfall of using GP to model signal fingerprint uncertainty. This paper also proposes Deep Gaussian Processes (DGP) as a more informative alternative to address the issue. How DGP better measures uncertainty in signal fingerprinting is evaluated via simulated and realistically collected datasets.

Submitted 22 August, 2022; v1 submitted 31 August, 2021; originally announced September 2021.

Comments: 8 pages, 10 figures; Presented at the 2021 International Conference on Indoor Positioning and Indoor Navigation (IPIN)

arXiv:1912.01240 [pdf]

doi 10.1038/s41467-020-20215-y

Lunar impact craters identification and age estimation with Chang'E data by deep and transfer learning

Authors: Chen Yang, Haishi Zhao, Lorenzo Bruzzone, Jon Atli Benediktsson, Yanchun Liang, Bin Liu, Xingguo Zeng, Renchu Guan, Chunlai Li, Ziyuan Ouyang

Abstract: Impact craters, as "lunar fossils", are the most dominant lunar surface features and occupy most of the Moon's surface. Their formation and evolution record the history of the Solar System. Sixty years of triumphs in lunar exploration projects have accumulated a large amount of lunar data. Currently, there are 9137 existing recognized craters. However, only 1675 of them have had their age determined, which is not sufficient to reveal the evolution of the Moon. Identifying craters is a challenging task due to their enormous difference in size, large variations in shape and vast presence. Furthermore, estimating the age of craters is extraordinarily difficult due to their complex and different morphologies. Here, in order to effectively identify craters and estimate their age, we convert the crater identification problem into a target detection task and crater age estimation into a taxonomy structure. From an initial small number of available craters, we progressively identify craters and estimate their age from Chang'E data by transfer learning (TL) using deep neural networks. For comprehensive identification of multi-scale craters, a two-stage crater detection approach is developed. Thus 117240 unrecognized lunar craters that range in diameter from 532 km to 1 km are identified. Then, a two-stage classification approach is developed to estimate the age of craters by simultaneously extracting their morphological features and stratigraphic information. The age of 79243 craters larger than 3 km in diameter is estimated. These identified and aged craters throughout the mid and low-latitude regions of the Moon are crucial for reconstructing the dynamic evolution process of the Solar System.

Submitted 3 December, 2019; originally announced December 2019.

Comments: 6 pages, 3 figures

Journal ref: Nat Commun 11, 6358 (2020)

arXiv:1902.03700 [pdf, other]

Accelerating Partial Evaluation in Distributed SPARQL Query Evaluation

Authors: Peng Peng, Lei Zou, Runyu Guan

Abstract: Partial evaluation has recently been used for processing SPARQL queries over a large resource description framework (RDF) graph in a distributed environment. However, the previous approach is inefficient when dealing with complex queries. In this study, we further improve the "partial evaluation and assembly" framework for answering SPARQL queries over a distributed RDF graph, while providing performance guarantees. Our key idea is to explore the intrinsic structural characteristics of partial matches to filter out irrelevant partial results, while providing performance guarantees on a network trace (data shipment) or the computational cost (response time). We also propose an efficient assembly algorithm to utilize the characteristics of partial matches to merge them and form final results. To improve the efficiency of finding partial matches further, we propose an optimization that communicates variables' candidates among sites to avoid redundant computations. In addition, although our approach is partitioning-tolerant, different partitioning strategies result in different performances, and we evaluate different partitioning strategies for our approach. Experiments over both real and synthetic RDF datasets confirm the superiority of our approach.

Submitted 15 February, 2019; v1 submitted 10 February, 2019; originally announced February 2019.

Comments: 15 pages

arXiv:1810.05970 [pdf, ps, other]

doi 10.1142/S0217751X18502172

Analysis of the strong vertices of $Σ_cND^{*}$ and $Σ_bNB^{*}$ in QCD sum rules

Authors: Guo-Liang Yu, Rong-Hua Guan, Zhi-Gang Wang

Abstract: The strong coupling constant is an important parameter which can help us to understand the strong decay behaviors of baryons. In our previous work, we have analyzed strong vertices $Σ_{c}^{*}ND$, $Σ_{b}^{*}NB$, $Σ_{c}ND$, $Σ_{b}NB$ in QCD sum rules. Following this work, we further analyze the strong vertices $Σ_{c}ND^{*}$ and $Σ_{b}NB^{*}$ using the three-point QCD sum rules under Dirac structures $q\!\!\!/p\!\!\!/γ_α$ and $q\!\!\!/p\!\!\!/p_α$. In this work, we first calculate strong form factors considering contributions of the perturbative part and the condensate terms $\langle\overline{q}q\rangle$, $\langle\frac{α_{s}}{π}GG\rangle$ and $\langle\overline{q}g_{s}σGq\rangle$. Then, these form factors are fitted to analytical functions. According to these functions, we finally determine the values of the strong coupling constants for these two vertices $Σ_{c}ND^{*}$ and $Σ_{b}NB^{*}$.

Submitted 14 October, 2018; originally announced October 2018.

Comments: arXiv admin note: text overlap with arXiv:1705.03229

Journal ref: Int.J.Mod.Phys.A 32 (2017) 35, 1750203

arXiv:1611.07232 [pdf, ps, other]

Compositional Learning of Relation Path Embedding for Knowledge Base Completion

Authors: Xixun Lin, Yanchun Liang, Fausto Giunchiglia, Xiaoyue Feng, Renchu Guan

Abstract: Large-scale knowledge bases have currently reached impressive sizes; however, these knowledge bases are still far from complete. In addition, most of the existing methods for knowledge base completion only consider the direct links between entities, ignoring the vital impact of the consistent semantics of relation paths. In this paper, we study the problem of how to better embed entities and relations of knowledge bases into different low-dimensional spaces by taking full advantage of the additional semantics of relation paths, and we propose a compositional learning model of relation path embedding (RPE). Specifically, with the corresponding relation and path projections, RPE can simultaneously embed each entity into two types of latent spaces. It is also proposed that type constraints could be extended from traditional relation-specific constraints to the new proposed path-specific constraints. The results of experiments show that the proposed model achieves significant and consistent improvements compared with the state-of-the-art algorithms.

Submitted 23 February, 2017; v1 submitted 22 November, 2016; originally announced November 2016.

Comments: 7 pages, 1 figure

Showing 1–38 of 38 results for author: guan, R