-
ScreenTK: Seamless Detection of Time-Killing Moments Using Continuous Mobile Screen Text Monitoring
Authors:
Le Fang,
Shiquan Zhang,
Hong Jia,
Jorge Goncalves,
Vassilis Kostakos
Abstract:
Smartphones have become essential to people's digital lives, providing a continuous stream of information and connectivity. However, this constant flow can lead to moments where users are simply passing time rather than engaging meaningfully. This underscores the importance of developing methods to identify these "time-killing" moments, enabling the delivery of important notifications in a way that minimizes interruptions and enhances user engagement. Recent work has utilized screenshots taken every 5 seconds to detect time-killing activities on smartphones. However, this method often fails to capture phone usage between screenshots. We demonstrate that up to 50% of time-killing instances go undetected using screenshots, leading to substantial gaps in understanding user behavior. To address this limitation, we propose a method called ScreenTK that detects time-killing moments by leveraging continuous screen text monitoring and on-device large language models (LLMs). Screen text contains more comprehensive information than screenshots and allows LLMs to summarize detailed phone usage. To verify our framework, we conducted experiments with six participants, capturing 1,034 records of different time-killing moments. Initial results show that our framework outperforms state-of-the-art solutions by 38% in our case study.
Submitted 3 July, 2024;
originally announced July 2024.
-
A Radiometric Correction based Optical Modeling Approach to Removing Reflection Noise in TLS Point Clouds of Urban Scenes
Authors:
Li Fang,
Tianyu Li,
Yanghong Lin,
Shudong Zhou,
Wei Yao
Abstract:
Point clouds are vital in computer vision tasks such as 3D reconstruction, autonomous driving, and robotics. However, TLS-acquired point clouds often contain virtual points from reflective surfaces, causing disruptions. This study presents a reflection noise elimination algorithm for TLS point clouds. Our innovative reflection plane detection algorithm, based on geometry-optical models and physical properties, identifies and categorizes reflection points per optical reflection theory. We've adapted the LSFH feature descriptor to retain reflection features, mitigating interference from symmetrical architectural structures. By incorporating the Hausdorff feature distance, the algorithm enhances resilience to ghosting and deformation, improving virtual point detection accuracy. Extensive experiments on the 3DRN benchmark dataset, featuring diverse urban environments with virtual TLS reflection noise, show our algorithm improves precision and recall rates for 3D points in reflective regions by 57.03% and 31.80%, respectively. Our method achieves a 9.17% better outlier detection rate and 5.65% higher accuracy than leading methods. Access the 3DRN dataset at https://github.com/Tsuiky/3DRN.
Submitted 3 July, 2024;
originally announced July 2024.
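The abstract above relies on a Hausdorff feature distance. As a rough illustration only, the sketch below computes the classic symmetric Hausdorff distance between two small point sets. The function names and the toy coordinates are assumptions made for this example; the abstract does not say how the paper applies the distance to its feature descriptors.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest neighbour in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Toy data (hypothetical): the second set contains one stray point.
real = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
with_outlier = real + [(5.0, 0.0, 0.0)]
print(hausdorff(real, with_outlier))  # 4.0
```

The distance is driven by the single worst-matched point, which is why one stray point at x = 5 yields a value of 4.0 here.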
-
Diffusion Transformer Model With Compact Prior for Low-dose PET Reconstruction
Authors:
Bin Huang,
Xubiao Liu,
Lei Fang,
Qiegen Liu,
Bingxuan Li
Abstract:
Positron emission tomography (PET) is an advanced medical imaging technique that plays a crucial role in non-invasive clinical diagnosis. However, while reducing radiation exposure through low-dose PET scans is beneficial for patient safety, it often results in insufficient statistical data. This scarcity of data poses significant challenges for accurately reconstructing high-quality images, which are essential for reliable diagnostic outcomes. In this research, we propose a diffusion transformer model (DTM) guided by joint compact prior (JCP) to enhance the reconstruction quality of low-dose PET imaging. In light of current research findings, we present a pioneering PET reconstruction model that integrates diffusion and transformer models for joint optimization. This model combines the powerful distribution mapping abilities of diffusion models with the capacity of transformers to capture long-range dependencies, offering significant advantages for low-dose PET reconstruction. Additionally, the incorporation of the lesion refining block and penalized weighted least squares (PWLS) enhances the recovery of lesion regions and preserves detail information, addressing the blurring of lesion areas and texture details that affects most deep learning frameworks. Experimental results demonstrate the effectiveness of DTM in enhancing image quality and preserving critical clinical information for low-dose PET scans. Our approach not only reduces radiation exposure risks but also provides a more reliable PET imaging tool for early disease detection and patient management.
Submitted 30 June, 2024;
originally announced July 2024.
-
Long cycles and spectral radii in planar graphs
Authors:
** Xu,
Huiqiu Lin,
Longfei Fang
Abstract:
There is a rich history of studying the existence of cycles in planar graphs. The famous Tutte theorem on the Hamilton cycle states that every 4-connected planar graph contains a Hamilton cycle. Later on, Thomassen (1983), Thomas and Yu (1994) and Sanders (1996) proved that every 4-connected planar graph contains a cycle of length $n-1$, $n-2$ and $n-3$, respectively. Chen, Fan and Yu (2004) further conjectured that every 4-connected planar graph contains a cycle of length $\ell$ for $\ell\in\{n,n-1,\ldots,n-25\}$ and verified the cases $\ell\in \{n-4, n-5, n-6\}$. When the "4-connected" condition is removed, how can one guarantee the existence of a long cycle in a planar graph? A natural approach is to add a spectral radius condition: what is the smallest constant $C$ such that, for sufficiently large $n$, every planar graph $G$ of order $n$ with spectral radius greater than $C$ contains a long cycle? In this paper, we give a stronger answer to the above question. Let $G$ be a planar graph of order $n\geq 1.8\times 10^{17}$ and let $k\leq \lfloor\log_2(n-3)\rfloor-8$ be a non-negative integer. We show that if $ρ(G)\geq ρ(K_2\vee(P_{n-2k-4}\cup 2P_{k+1}))$ then $G$ contains a cycle of length $\ell$ for every $\ell\in \{n-k, n-k-1, \ldots, 3\}$ unless $G\cong K_2\vee(P_{n-2k-4}\cup 2P_{k+1})$.
Submitted 31 May, 2024;
originally announced May 2024.
-
Unleashing the Potential of Diffusion Models for Incomplete Data Imputation
Authors:
Hengrui Zhang,
Liancheng Fang,
Philip S. Yu
Abstract:
This paper introduces DiffPuter, an iterative method for missing data imputation that leverages the Expectation-Maximization (EM) algorithm and Diffusion Models. By treating missing data as hidden variables that can be updated during model training, we frame the missing data imputation task as an EM problem. During the M-step, DiffPuter employs a diffusion model to learn the joint distribution of both the observed and currently estimated missing data. In the E-step, DiffPuter re-estimates the missing data based on the conditional probability given the observed data, utilizing the diffusion model learned in the M-step. Starting with an initial imputation, DiffPuter alternates between the M-step and E-step until convergence. Through this iterative process, DiffPuter progressively refines the complete data distribution, yielding increasingly accurate estimations of the missing data. Our theoretical analysis demonstrates that the unconditional training and conditional sampling processes of the diffusion model align precisely with the objectives of the M-step and E-step, respectively. Empirical evaluations across 10 diverse datasets and comparisons with 16 different imputation methods highlight DiffPuter's superior performance. Notably, DiffPuter achieves an average improvement of 8.10% in MAE and 5.64% in RMSE compared to the most competitive existing method.
Submitted 31 May, 2024;
originally announced May 2024.
-
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Authors:
Linjiajie Fang,
Ruoxue Liu,
**g Zhang,
Wenjia Wang,
Bing-Yi **g
Abstract:
In offline reinforcement learning (RL), it is necessary to manage out-of-distribution actions to prevent overestimation of value functions. Policy-regularized methods address this problem by constraining the target policy to stay close to the behavior policy. Although several approaches suggest representing the behavior policy as an expressive diffusion model to boost performance, it remains unclear how to regularize the target policy given a diffusion-modeled behavior sampler. In this paper, we propose Diffusion Actor-Critic (DAC), which formulates the Kullback-Leibler (KL) constraint policy iteration as a diffusion noise regression problem, enabling direct representation of target policies as diffusion models. Our approach follows the actor-critic learning paradigm, in which we alternately train a diffusion-modeled target policy and a critic network. The actor training loss includes a soft Q-guidance term from the Q-gradient. The soft Q-guidance is grounded in the theoretical solution of the KL constraint policy iteration, which prevents the learned policy from taking out-of-distribution actions. For critic training, we train a Q-ensemble to stabilize the estimation of the Q-gradient. Additionally, DAC employs a lower confidence bound (LCB) to address the overestimation and underestimation of value targets due to function approximation error. Our approach is evaluated on the D4RL benchmarks and outperforms the state-of-the-art in almost all environments. Code is available at https://github.com/Fang-Lin93/DAC.
Submitted 30 May, 2024;
originally announced May 2024.
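The abstract above mentions a lower confidence bound (LCB) over a Q-ensemble. A common way to form such a bound is the ensemble mean minus a multiple of the ensemble standard deviation. The sketch below shows only that generic form; the function name `lcb_target` and the coefficient `beta` are assumptions for illustration, and the abstract does not state the exact expression DAC uses.

```python
import statistics

def lcb_target(q_values, beta=1.0):
    """Generic LCB: ensemble mean minus beta times the population standard deviation."""
    return statistics.fmean(q_values) - beta * statistics.pstdev(q_values)

# Hypothetical Q-value estimates from three ensemble members for one state-action pair.
ensemble = [1.0, 2.0, 3.0]
print(lcb_target(ensemble, beta=0.0))  # plain mean: 2.0
print(lcb_target(ensemble, beta=1.0))  # pessimistic value, below the mean
```

When the ensemble members agree, the bound equals the mean; larger disagreement lowers the target, and `beta` controls how strongly.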
-
SIG: Efficient Self-Interpretable Graph Neural Network for Continuous-time Dynamic Graphs
Authors:
Lanting Fang,
Yulian Yang,
Kai Wang,
Shanshan Feng,
Kaiyu Feng,
Jie Gui,
Shuliang Wang,
Yew-Soon Ong
Abstract:
While dynamic graph neural networks have shown promise in various applications, explaining their predictions on continuous-time dynamic graphs (CTDGs) is difficult. This paper investigates a new research task: self-interpretable GNNs for CTDGs. We aim to predict future links within the dynamic graph while simultaneously providing causal explanations for these predictions. There are two key challenges: (1) capturing the underlying structural and temporal information that remains consistent across both independent and identically distributed (IID) and out-of-distribution (OOD) data, and (2) efficiently generating high-quality link prediction results and explanations. To tackle these challenges, we propose a novel causal inference model, namely the Independent and Confounded Causal Model (ICCM). ICCM is then integrated into a deep learning architecture that considers both effectiveness and efficiency. Extensive experiments demonstrate that our proposed model significantly outperforms existing methods across link prediction accuracy, explanation quality, and robustness to shortcut features. Our code and datasets are anonymously released at https://github.com/2024SIG/SIG.
Submitted 29 May, 2024;
originally announced May 2024.
-
Blow-up criterion for a three-dimensional compressible non-Newtonian fluid with vacuum
Authors:
Junyuan Guo,
Li Fang
Abstract:
This work is devoted to establishing an improved blow-up criterion for strong solutions to a three-dimensional compressible non-Newtonian fluid with vacuum. The system considered is the power-law model in a bounded periodic domain in R^3. We establish a blow-up criterion for the local strong solutions in terms of the L^4(0,T;L^{\infty}(Ω)) norm of the gradient of the velocity for any power-law index q greater than 1.
Submitted 14 May, 2024;
originally announced May 2024.
-
AnomalyLLM: Few-shot Anomaly Edge Detection for Dynamic Graphs using Large Language Models
Authors:
Shuo Liu,
Di Yao,
Lanting Fang,
Zhetao Li,
Wenbin Li,
Kaiyu Feng,
XiaoWen Ji,
**g** Bi
Abstract:
Detecting anomaly edges in dynamic graphs aims to identify edges that deviate significantly from the normal pattern and can be applied in various domains, such as cybersecurity, financial transactions and AIOps. Over time, new types of anomaly edges emerge, and labeled anomaly samples are few for each type. Current methods are either designed to detect randomly inserted edges or require sufficient labeled data for model training, which limits their applicability in real-world settings. In this paper, we study this problem by leveraging the rich knowledge encoded in large language models (LLMs) and propose a method, namely AnomalyLLM. To align the dynamic graph with LLMs, AnomalyLLM pre-trains a dynamic-aware encoder to generate the representations of edges and reprograms the edges using the prototypes of word embeddings. Along with the encoder, we design an in-context learning framework that integrates the information of a few labeled samples to achieve few-shot anomaly detection. Experiments on four datasets reveal that AnomalyLLM can not only significantly improve the performance of few-shot anomaly detection, but also achieve superior results on new anomalies without any update of model parameters.
Submitted 13 May, 2024;
originally announced May 2024.
-
A De-singularity Subgradient Approach for the Extended Weber Location Problem
Authors:
Zhao-Rong Lai,
Xiaotian Wu,
Liangda Fang,
Ziliang Chen
Abstract:
The extended Weber location problem is a classical optimization problem that has recently inspired new work in several machine learning scenarios. However, most existing algorithms, such as the widely used iterative Weiszfeld approach, may get stuck due to the singularity at the data points when the power $q$ of the cost function satisfies $1\leqslant q<2$. In this paper, we establish a de-singularity subgradient approach for this problem. We also provide a complete proof of convergence, which fixes some incomplete statements in the proofs for some previous Weiszfeld algorithms. Moreover, we deduce a new theoretical result of superlinear convergence for the iteration sequence in a special case where the minimum point is a singular point. We conduct extensive experiments in a real-world machine learning scenario to show that the proposed approach solves the singularity problem, produces the same results as in the non-singularity cases, and shows a reasonable rate of linear convergence. The results also indicate that the $q$-th power case ($1<q<2$) is more advantageous than the $1$-st power case and the $2$-nd power case in some situations. Hence the de-singularity subgradient approach is beneficial to advancing both theory and practice for the extended Weber location problem.
Submitted 11 May, 2024;
originally announced May 2024.
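The singularity mentioned in the abstract above can be seen in a few lines. The sketch below is the plain Weiszfeld iteration for the $q=1$ case (the geometric median), not the paper's de-singularity subgradient method. The function name, the stopping rule, and the toy data are assumptions chosen for illustration.

```python
import math

def weiszfeld(points, start, iters=200, eps=1e-12):
    """Plain Weiszfeld iteration for q = 1; returns (iterate, hit_singularity)."""
    x = list(start)
    for _ in range(iters):
        num = [0.0] * len(x)
        den = 0.0
        for a in points:
            d = math.dist(x, a)
            if d < eps:
                # The iterate sits on a data point, so the weight 1/d is undefined.
                # The plain method has no update rule here and simply stops.
                return x, True
            w = 1.0 / d
            den += w
            for j, aj in enumerate(a):
                num[j] += w * aj
        # New iterate: average of the data points weighted by inverse distance.
        x = [n / den for n in num]
    return x, False

# Toy data (hypothetical): corners of a square, whose geometric median is (1, 1).
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(weiszfeld(square, (0.3, 0.4)))  # converges towards (1, 1)
print(weiszfeld(square, (0.0, 0.0)))  # starts on a data point and stops at once
```

Starting on a data point makes the very first weight undefined, which is the failure mode that de-singularity approaches aim to handle.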
-
ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction
Authors:
Henry Peng Zou,
Vinay Samuel,
Yue Zhou,
Weizhi Zhang,
Liancheng Fang,
Zihe Song,
Philip S. Yu,
Cornelia Caragea
Abstract:
Existing datasets for attribute value extraction (AVE) predominantly focus on explicit attribute values while neglecting the implicit ones, lack product images, are often not publicly available, and lack an in-depth human inspection across diverse domains. To address these limitations, we present ImplicitAVE, the first, publicly available multimodal dataset for implicit attribute value extraction. ImplicitAVE, sourced from the MAVE dataset, is carefully curated and expanded to include implicit AVE and multimodality, resulting in a refined dataset of 68k training and 1.6k testing data across five domains. We also explore the application of multimodal large language models (MLLMs) to implicit AVE, establishing a comprehensive benchmark for MLLMs on the ImplicitAVE dataset. Six recent MLLMs with eleven variants are evaluated across diverse settings, revealing that implicit value extraction remains a challenging task for MLLMs. The contributions of this work include the development and release of ImplicitAVE, and the exploration and benchmarking of various MLLMs for implicit AVE, providing valuable insights and potential future research directions. Dataset and code are available at https://github.com/HenryPengZou/ImplicitAVE
Submitted 23 April, 2024;
originally announced April 2024.
-
Eigenvalues and graph minors
Authors:
Mingqing Zhai,
Longfei Fang,
Huiqiu Lin
Abstract:
Let $spex(n,H_{minor})$ denote the maximum spectral radius of $n$-vertex $H$-minor free graphs. The problem of determining this extremal value dates back to the early 1990s. Up to now, it has been solved for $n$ sufficiently large and some special minors, such as $\{K_{2,3},K_4\}$, $\{K_{3,3},K_5\}$, $K_r$ and $K_{s,t}$. In this paper, we find some unified phenomena on general minors. Every graph $G$ on $n$ vertices with spectral radius $ρ\geq spex(n,H_{minor})$ contains either an $H$ minor or a spanning book $K_{γ_H}\nabla(n-γ_H)K_1$, where $γ_H=|H|-α(H)-1$. Furthermore, assume that $G$ is $H$-minor free and $Γ^*_s(H)$ is the family of $s$-vertex irreducible induced subgraphs of $H$; then $G$ minus its $γ_H$ dominating vertices is $Γ^*_{α(H)+1}(H)$-minor saturated, and it is further edge-maximal if $Γ^*_{α(H)+1}(H)$ is a connected family. As applications, we recover some known results on the minors mentioned above. We also determine the extremal values for some other minors, such as flowers, wheels, generalized books and complete multi-partite graphs. Our results extend some conjectures on planar graphs, outer-planar graphs and $K_{s,t}$-minor free graphs. To obtain the results, we combine the stability method, spectral techniques and structural analyses. In particular, we explore the use of the absorbing method in spectral extremal problems.
Submitted 20 April, 2024;
originally announced April 2024.
-
Hyperspectral Anomaly Detection with Self-Supervised Anomaly Prior
Authors:
Yidan Liu,
Weiying Xie,
Kai Jiang,
Jiaqing Zhang,
Yunsong Li,
Leyuan Fang
Abstract:
The majority of existing hyperspectral anomaly detection (HAD) methods use the low-rank representation (LRR) model to separate the background and anomaly components, where the anomaly component is optimized by handcrafted sparse priors (e.g., $\ell_{2,1}$-norm). However, this may not be ideal since they overlook the spatial structure present in anomalies and make the detection result largely dependent on manually set sparsity. To tackle these problems, we redefine the optimization criterion for the anomaly component in the LRR model with a self-supervised network called self-supervised anomaly prior (SAP). This prior is obtained by the pretext task of self-supervised learning, which is customized to learn the characteristics of hyperspectral anomalies. Specifically, this pretext task is a classification task to distinguish the original hyperspectral image (HSI) and the pseudo-anomaly HSI, where the pseudo-anomaly is generated from the original HSI and designed as a prism with arbitrary polygon bases and arbitrary spectral bands. In addition, a dual-purified strategy is proposed to provide a more refined background representation with an enriched background dictionary, facilitating the separation of anomalies from complex backgrounds. Extensive experiments on various hyperspectral datasets demonstrate that the proposed SAP offers a more accurate and interpretable solution than other advanced HAD methods.
Submitted 20 April, 2024;
originally announced April 2024.
-
Turán numbers for non-bipartite graphs and applications to spectral extremal problems
Authors:
Longfei Fang,
Michael Tait,
Mingqing Zhai
Abstract:
Given a graph family $\mathcal{H}$ with $\min_{H\in \mathcal{H}}χ(H)=r+1\geq 3$, let ${\rm ex}(n,\mathcal{H})$ and ${\rm spex}(n,\mathcal{H})$ be the maximum number of edges and the maximum spectral radius of the adjacency matrix over all $\mathcal{H}$-free graphs of order $n$, respectively. Denote by ${\rm EX}(n,\mathcal{H})$ (resp. ${\rm SPEX}(n,\mathcal{H})$) the set of extremal graphs with respect to ${\rm ex}(n,\mathcal{H})$ (resp. ${\rm spex}(n,\mathcal{H})$).
In this paper, we use a decomposition family defined by Simonovits to give a characterization of which graph families $\mathcal{H}$ satisfy ${\rm ex}(n,\mathcal{H})<e(T_{n,r})+\lfloor \frac{n}{2r} \rfloor$. Furthermore, we completely determine ${\rm EX}\big(n,\mathbb{G}(F_1,\ldots,F_k)\big)$ for $n$ sufficiently large, where $\mathbb{G}(F_1,\ldots,F_k)$ denotes a finite graph family which consists of $k$ edge-disjoint $(r+1)$-chromatic color-critical graphs $F_1,\ldots,F_k$. This result strengthens a theorem of Győri, who settled the case that $F_1=\cdots =F_k = K_{r+1}$.
Wang, Kang and Xue [J. Combin. Theory Ser. B 159 (2023) 20--41] proved that ${\rm SPEX}(n,H)\subseteq {\rm EX}(n,H)$ for $n$ sufficiently large and any graph $H$ with ${\rm ex}(n,H)=e(T_{n,r})+O(1)$. As an application of our first theorem, we show that ${\rm SPEX}(n,\mathcal{H})\subseteq {\rm EX}(n,\mathcal{H})$ for $n$ sufficiently large and any finite family $\mathcal{H}$ with ${\rm ex}(n,\mathcal{H})<e(T_{n,r})+\lfloor \frac{n}{2r}\rfloor$. As an application of our second theorem, we completely determine ${\rm SPEX}\big(n,\mathbb{G}(F_1,\ldots,F_k)\big)$ for $n$ sufficiently large.
Finally, related problems are proposed for further research.
Submitted 13 April, 2024;
originally announced April 2024.
-
Diffusion Models Meet Remote Sensing: Principles, Methods, and Perspectives
Authors:
Yidan Liu,
Jun Yue,
Shaobo Xia,
Pedram Ghamisi,
Weiying Xie,
Leyuan Fang
Abstract:
As a newly emerging advance in deep generative models, diffusion models have achieved state-of-the-art results in many fields, including computer vision, natural language processing, and molecule design. The remote sensing community has also noticed the powerful ability of diffusion models and quickly applied them to a variety of tasks for image processing. Given the rapid increase in research on diffusion models in the field of remote sensing, it is necessary to conduct a comprehensive review of existing diffusion model-based remote sensing papers, to help researchers recognize the potential of diffusion models and provide some directions for further exploration. Specifically, this paper first introduces the theoretical background of diffusion models, and then systematically reviews the applications of diffusion models in remote sensing, including image generation, enhancement, and interpretation. Finally, the limitations of existing remote sensing diffusion models and worthy research directions for further exploration are discussed and summarized.
Submitted 17 April, 2024; v1 submitted 13 April, 2024;
originally announced April 2024.
-
XScale-NVS: Cross-Scale Novel View Synthesis with Hash Featurized Manifold
Authors:
Guangyu Wang,
**zhi Zhang,
Fan Wang,
Ruqi Huang,
Lu Fang
Abstract:
We propose XScale-NVS for high-fidelity cross-scale novel view synthesis of real-world large-scale scenes. Existing representations based on explicit surface suffer from discretization resolution or UV distortion, while implicit volumetric representations lack scalability for large scenes due to the dispersed weight distribution and surface ambiguity. In light of the above challenges, we introduce the hash featurized manifold, a novel hash-based featurization coupled with a deferred neural rendering framework. This approach fully unlocks the expressivity of the representation by explicitly concentrating the hash entries on the 2D manifold, thus effectively representing highly detailed contents independent of the discretization resolution. We also introduce a novel dataset, namely GigaNVS, to benchmark cross-scale, high-resolution novel view synthesis of real-world large-scale scenes. Our method significantly outperforms competing baselines on various real-world scenes, yielding an average LPIPS that is 40% lower than prior state-of-the-art on the challenging GigaNVS benchmark. Please see our project page at: xscalenvs.github.io.
Submitted 28 March, 2024;
originally announced March 2024.
-
Enhanced Bayesian Personalized Ranking for Robust Hard Negative Sampling in Recommender Systems
Authors:
Kexin Shi,
**g Zhang,
Linjiajie Fang,
Wenjia Wang,
Bingyi **g
Abstract:
In implicit collaborative filtering, hard negative mining techniques are developed to accelerate and enhance the recommendation model learning. However, the inadvertent selection of false negatives remains a major concern in hard negative sampling, as these false negatives can provide incorrect information and mislead the model learning. To date, only a small number of studies have been devoted to solving the false negative problem, primarily focusing on designing sophisticated sampling algorithms to filter false negatives. In contrast, this paper shifts its focus to refining the loss function. We find that the original Bayesian Personalized Ranking (BPR), initially designed for uniform negative sampling, is inadequate in adapting to hard sampling scenarios. Hence, we introduce an enhanced Bayesian Personalized Ranking objective, named Hard-BPR, which is specifically crafted for dynamic hard negative sampling to mitigate the influence of false negatives. This method is simple yet efficient for real-world deployment. Extensive experiments conducted on three real-world datasets demonstrate the effectiveness and robustness of our approach, along with the enhanced ability to distinguish false negatives.
Submitted 28 March, 2024;
originally announced March 2024.
-
Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation
Authors:
Linshan Wu,
Zhun Zhong,
Jiayi Ma,
Yunchao Wei,
Hao Chen,
Leyuan Fang,
Shutao Li
Abstract:
Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models with weak labels and is receiving significant attention due to its low annotation cost. Existing approaches focus on generating pseudo labels for supervision while largely neglecting to leverage the inherent semantic correlation among different pseudo labels. We observe that pseudo-labeled pixels that are close to each other in the feature space are more likely to share the same class, and those closer to the distribution centers tend to have higher confidence. Motivated by this, we propose to model the underlying label distributions and employ cross-label constraints to generate more accurate pseudo labels. In this paper, we develop a unified WSSS framework named Adaptive Gaussian Mixtures Model, which leverages a GMM to model the label distributions. Specifically, we calculate the feature distribution centers of pseudo-labeled pixels and build the GMM by measuring the distance between the centers and each pseudo-labeled pixel. Then, we introduce an Online Expectation-Maximization (OEM) algorithm and a novel maximization loss to optimize the GMM adaptively, aiming to learn more discriminative decision boundaries between different class-wise Gaussian mixtures. Based on the label distributions, we leverage the GMM to generate high-quality pseudo labels for more reliable supervision. Our framework is capable of handling different forms of weak labels: image-level labels, points, scribbles, blocks, and bounding boxes. Extensive experiments on PASCAL, COCO, Cityscapes, and ADE20K datasets demonstrate that our framework can effectively provide more reliable supervision and outperform the state-of-the-art methods under all settings. Code will be available at https://github.com/Luffy03/AGMM-SASS.
Submitted 19 March, 2024;
originally announced March 2024.
-
Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience
Authors:
Xiaohang Yu,
Zhengxian Yang,
Shi Pan,
Yuqi Han,
Haoxiang Wang,
Jun Zhang,
Shi Yan,
Borong Lin,
Lei Yang,
Tao Yu,
Lu Fang
Abstract:
We have built a custom mobile multi-camera large-space dense light field capture system, which provides a series of high-quality and sufficiently dense light field images for various scenarios. Our aim is to contribute to the development of popular 3D scene reconstruction algorithms such as IBRNet, NeRF, and 3D Gaussian splatting. More importantly, the collected dataset, which is much denser than existing datasets, may also inspire space-oriented light field reconstruction, which is potentially different from object-centric 3D reconstruction, for immersive VR/AR experiences. We utilized a total of 40 GoPro 10 cameras, capturing images of 5k resolution. The number of photos captured for each scene is no less than 1000, and the average density (view number within a unit sphere) is 134.68. It is also worth noting that our system is capable of efficiently capturing large outdoor scenes. Addressing the current lack of large-space and dense light field datasets, we made efforts to include elements such as sky, reflections, lights and shadows that are of interest to researchers in the field of 3D reconstruction during the data capture process. Finally, we validated the effectiveness of our provided dataset on three popular algorithms and also integrated the reconstructed 3DGS results into the Unity engine, demonstrating the potential of utilizing our datasets to enhance the realism of virtual reality (VR) and create feasible interactive spaces. The dataset is available at our project website.
Submitted 14 March, 2024;
originally announced March 2024.
-
Microcavity induced by few-layer GaSe crystal on silicon photonic crystal waveguide for efficient optical frequency conversion
Authors:
Xiaoqing Chen,
Yanyan Zhang,
Yingke Ji,
Yu Zhang,
Jianguo Wang,
Xianghu Wu,
Chenyang Zhao,
Liang Fang,
Biqiang Jiang,
Jianlin Zhao,
Xuetao Gan
Abstract:
We demonstrate the post-induction of a high-quality microcavity on a silicon photonic crystal (PC) waveguide by integrating few-layer GaSe crystal, which promises highly efficient on-chip optical frequency conversions. The integration of GaSe shifts the dispersion bands of the PC waveguide mode into the bandgap, resulting in localized modes confined by the bare PC waveguides. Thanks to the small contrast of refractive index at the boundaries of the microcavity, quality (Q) factors exceeding 10^4 can be reliably obtained. With the enhanced light-GaSe interaction by the microcavity modes and high second-order nonlinearity of GaSe, remarkable second-harmonic generation (SHG) and sum-frequency generation (SFG) are achieved. A record-high on-chip SHG conversion efficiency of 131100% W^-1 is obtained, enabling the clear SHG imaging of the resonant modes with a sub-milliwatt continuous-wave (CW) laser pump. Driven by a pump of on-resonance CW laser, strong SFGs are successfully carried out with the other pump of a CW laser spanning over the broad telecom-band. Broadband frequency conversion of an incoherent superluminescent light-emitting diode with low spectral power density is also realized in the integrated GaSe-PC waveguide. Our results are expected to provide new strategies for high-efficiency light-matter interactions, nonlinear photonics and light source generation in silicon photonic integrated circuits.
Submitted 3 March, 2024;
originally announced March 2024.
-
NiteDR: Nighttime Image De-Raining with Cross-View Sensor Cooperative Learning for Dynamic Driving Scenes
Authors:
Cidan Shi,
Lihuang Fang,
Han Wu,
Xiaoyu Xian,
Yukai Shi,
Liang Lin
Abstract:
In real-world environments, outdoor imaging systems are often affected by disturbances such as rain degradation. Especially in nighttime driving scenes, insufficient and uneven lighting shrouds the scenes in darkness, resulting in degradation of both the image quality and visibility. Particularly, in the field of autonomous driving, the visual perception ability of RGB sensors experiences a sharp decline in such harsh scenarios. Additionally, driving assistance systems suffer from reduced capabilities in capturing and discerning the surrounding environment, posing a threat to driving safety. Single-view information captured by single-modal sensors cannot comprehensively depict the entire scene. To address these challenges, we developed an image de-raining framework tailored for rainy nighttime driving scenes. It aims to remove rain artifacts, enrich scene representation, and restore useful information. Specifically, we introduce cooperative learning between visible and infrared images captured by different sensors. By cross-view fusion of these multi-source data, the scene within the images gains richer texture details and enhanced contrast. We constructed an information cleaning module called CleanNet as the first stage of our framework. Moreover, we designed an information fusion module called FusionNet as the second stage to fuse the clean visible images with infrared images. Using this stage-by-stage learning strategy, we obtain de-rained fusion images with higher quality and better visual perception. Extensive experiments demonstrate the effectiveness of our proposed Cross-View Cooperative Learning (CVCL) in adverse driving scenarios in low-light rainy environments. The proposed approach addresses the gap in the utilization of existing rain removal algorithms in specific low-light conditions.
Submitted 7 April, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Nonlinear photodetector based on InSe p-n homojunction for improving spatial imaging resolution
Authors:
Yu Zhang,
Xiaoqing Chen,
Mingwen Zhang,
Xianghu Wu,
Jianguo Wang,
Ruijuan Tian,
Liang Fang,
Yanyan Zhang,
Jianlin Zhao,
Xuetao Gan
Abstract:
We demonstrate an efficient nonlinear photodetector (NLPD) with quadratic response based on a few-layer InSe p-n homojunction, which benefits from the strong second harmonic generation (SHG) process in InSe and effective harvest of photocarriers actuated by the high-quality homojunction. The NLPD can sense light with photon energy smaller than the InSe electronic bandgap because the SHG process in InSe doubles the frequency of incident light, extending the InSe photodetection wavelength range to 1750 nm. The InSe p-n homojunction, which is electrostatically doped by two split back gates, presents a rectification ratio exceeding 10^6 with a dark current down to 2 pA and a high normalized responsivity of 0.534 A/W^2 for the telecom-band pulsed light at 1550 nm. The photocurrents of the SHG-assisted photodetection have a quadratic dependence on the optical powers, making the NLPD highly sensitive to light intensity variation with improved spatial resolution. As examples, the NLPD is employed to precisely determine the localization point of a focused laser beam waist and implement spatial imaging with an improved resolution compared with the linear photodetector. These features highlight the potential of the proposed NLPD in developing advanced optical sensing and imaging systems.
Submitted 24 February, 2024;
originally announced February 2024.
-
Compact on-chip power splitter based on topological photonic crystal
Authors:
Puhui Zhang,
Jiacheng Zhang,
Linpeng Gu,
Liang Fang,
Yanyan Zhang,
Jianlin Zhao,
Xuetao Gan
Abstract:
We propose and demonstrate an on-chip 1×N power splitter based on topological photonic crystal (TPC) on a monolithic silicon photonic platform. Benefiting from the valley-locked propagation mode at the interface of TPCs with different topological phases, the proposed power splitter has negligible backscattering around the sharp bends and good robustness to fabrication defects, which therefore enable lower insertion loss, better uniformity, and more compact footprint than the conventional designs. For the fabricated 1×2 (8) power splitter, the uniformity among the output ports is below 0.35 (0.65) dB and the maximum insertion loss is 0.38 (0.58) dB with compact footprint of 5×5 μm^2 (10×12 μm^2) within a bandwidth of 70 nm. In addition, the topological power splitter only requires simple configurations of TPCs with different topological phases, which is more reliable in design and fabrication compared with the conventional designs.
Submitted 23 February, 2024;
originally announced February 2024.
-
Sliding-mediated ferroelectric phase transition in CuInP2S6 under pressure
Authors:
Zhou Zhou,
Jun-Jie Zhang,
Gemma F. Turner,
Stephen A. Moggach,
Yulia Lekina,
Samuel Morris,
Shun Wang,
Yiqi Hu,
Qiankun Li,
**shuo Xue,
Zhijian Feng,
Qingyu Yan,
Yuyan Weng,
Bin Xu,
Yong Fang,
Ze Xiang Shen,
Liang Fang,
Shuai Dong,
Lu You
Abstract:
Interlayer stacking order has recently emerged as a unique degree of freedom to control crystal symmetry and physical properties in two-dimensional van der Waals (vdW) materials and heterostructures. By tuning the layer stacking pattern, symmetry-breaking and electric polarization can be created in otherwise non-polar crystals, whose polarization reversal depends on the interlayer sliding motion. Herein, we demonstrate that in a vdW layered ferroelectric, its existing polarization is closely coupled to the interlayer sliding driven by hydrostatic pressure. Through combined structural, electrical, vibrational characterizations, and theoretical calculations, we clearly map out the structural evolution of CuInP2S6 under pressure. A tendency towards a high polarization state is observed in the low-pressure region, followed by an interlayer-sliding-mediated phase transition from a monoclinic to a trigonal phase. Along the transformation pathway, the displacive-instable Cu ion serves as a pivot point that regulates the interlayer interaction in response to external pressure. The rich phase diagram of CuInP2S6, which is enabled by stacking orders, sheds light on the physics of vdW ferroelectricity and opens an alternative route to tailoring long-range order in vdW layered crystals.
Submitted 21 February, 2024;
originally announced February 2024.
-
Integration of multiview microbiome data for deciphering microbiome-metabolome-disease pathways
Authors:
Lei Fang,
Yue Wang,
Chenglong Ye
Abstract:
The intricate interplay between host organisms and their gut microbiota has catalyzed research into the microbiome's role in disease, shedding light on novel aspects of disease pathogenesis. However, the mechanisms through which the microbiome exerts its influence on disease remain largely unclear. In this study, we first introduce a structural equation model to delineate the pathways connecting the microbiome, metabolome, and disease processes, utilizing target multiview microbiome data. To mitigate the challenges posed by hidden confounders, we further propose an integrative approach that incorporates data from an external microbiome cohort. This method also supports the identification of disease-specific and microbiome-associated metabolites that are missing in the target cohort. We provide theoretical underpinnings for the estimations derived from our integrative approach, demonstrating estimation consistency and asymptotic normality. The effectiveness of our methodologies is validated through comprehensive simulation studies and an empirical application to inflammatory bowel disease, highlighting their potential to unravel the complex relationships between the microbiome, metabolome, and disease.
Submitted 16 February, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Arbitrarily configurable nonlinear topological modes
Authors:
Kai Bai,
Jia-Zheng Li,
Tian-Rui Liu,
Liang Fang,
Duanduan Wan,
Meng Xiao
Abstract:
Topological modes (TMs) are typically localized at boundaries, interfaces and dislocations, and exponentially decay into the bulk of a large enough lattice. Recently, the non-Hermitian skin effect has been leveraged to delocalize the wavefunctions of TMs from the boundary and thus to increase the capacity of TMs dramatically. Here, we explore the capability of nonlinearity in designing and reconfiguring the wavefunctions of TMs. With growing intensity, wavefunctions of these in-gap nonlinear TMs undergo an initial deviation from exponential decay, gradually merge into arbitrarily designable plateaus, then encompass the entire nonlinear domain, and eventually concentrate at the nonlinear boundary. Intriguingly, such extended nonlinear TMs are still robust against defects and disorders, and stable in dynamics under external excitation. Advancing the conceptual understanding of the nonlinear TMs, our results open new avenues for increasing the capacity of TMs and developing compact and reconfigurable topological devices.
Submitted 11 February, 2024;
originally announced February 2024.
-
Artwork Protection Against Neural Style Transfer Using Locally Adaptive Adversarial Color Attack
Authors:
Zhongliang Guo,
Junhao Dong,
Yifei Qian,
Kaixuan Wang,
Weiye Li,
Ziheng Guo,
Yuheng Wang,
Yanli Li,
Ognjen Arandjelović,
Lei Fang
Abstract:
Neural style transfer (NST) generates new images by combining the style of one image with the content of another. However, unauthorized NST can exploit artwork, raising concerns about artists' rights and motivating the development of proactive protection methods. We propose Locally Adaptive Adversarial Color Attack (LAACA), empowering artists to protect their artwork from unauthorized style transfer by processing before public release. By delving into the intricacies of human visual perception and the role of different frequency components, our method strategically introduces frequency-adaptive perturbations in the image. These perturbations significantly degrade the generation quality of NST while maintaining an acceptable level of visual change in the original image, ensuring that potential infringers are discouraged from using the protected artworks, because of its bad NST generation quality. Additionally, existing metrics often overlook the importance of color fidelity in evaluating color-mattered tasks, such as the quality of NST-generated images, which is crucial in the context of artistic works. To comprehensively assess the color-mattered tasks, we propose the Adversarial Color Distance Metric (ACDM), designed to quantify the color difference of images pre- and post-manipulations. Experimental results confirm that attacking NST using LAACA results in visually inferior style transfer, and the ACDM can efficiently measure color-mattered tasks. By providing artists with a tool to safeguard their intellectual property, our work relieves the socio-technical challenges posed by the misuse of NST in the art community.
Submitted 19 April, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Spectral extremal results on trees
Authors:
Longfei Fang,
Huiqiu Lin,
**long Shu,
Zhiyuan Zhang
Abstract:
Let ${\rm spex}(n,F)$ be the maximum spectral radius over all $F$-free graphs of order $n$, and ${\rm SPEX}(n,F)$ be the family of $F$-free graphs of order $n$ with spectral radius equal to ${\rm spex}(n,F)$. Given integers $n,k,p$ with $n>k>0$ and $0\leq p\leq \lfloor(n-k)/2\rfloor$, let $S_{n,k}^{p}$ be the graph obtained from $K_k\nabla(n-k)K_1$ by embedding $p$ independent edges within its independent set, where `$\nabla$' means the join product. For $n\geq\ell\geq 4$, let $G_{n,\ell}=S_{n,(\ell-2)/2}^{0}$ if $\ell$ is even, and $G_{n,\ell}=S_{n,(\ell-3)/2}^{1}$ if $\ell$ is odd. Cioabă, Desai and Tait [SIAM J. Discrete Math. 37 (3) (2023) 2228--2239] showed that for $\ell\geq 6$ and sufficiently large $n$, if $\rho(G)\geq \rho(G_{n,\ell})$, then $G$ contains all trees of order $\ell$ unless $G=G_{n,\ell}$. They further posed a problem to study ${\rm spex}(n,F)$ for various specific trees $F$. Fix a tree $F$ of order $\ell\geq 6$, let $A$ and $B$ be two partite sets of $F$ with $|A|\leq |B|$, and set $q=|A|-1$. We first show that any graph in ${\rm SPEX}(n,F)$ contains a spanning subgraph $K_{q,n-q}$ for $q\geq 1$ and sufficiently large $n$. Consequently, $\rho(K_{q,n-q})\leq {\rm spex}(n,F)\leq \rho(G_{n,\ell})$; we further characterize all trees $F$ for which each of these two equalities holds. Secondly, we characterize the spectral extremal graphs for some specific trees and provide asymptotic spectral extremal values of the remaining trees. In particular, we characterize the spectral extremal graphs for all spiders; surprisingly, the extremal graphs are not always spanning subgraphs of $G_{n,\ell}$.
Submitted 18 January, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation
Authors:
Linshan Wu,
Ming Lu,
Leyuan Fang
Abstract:
Unsupervised domain adaptive (UDA) image segmentation has recently gained increasing attention, aiming to improve the generalization capability for transferring knowledge from the source domain to the target domain. However, in high spatial resolution remote sensing image (RSI), the same category from different domains (e.g., urban and rural) can appear to be totally different with extremely inconsistent distributions, which heavily limits the UDA accuracy. To address this problem, in this paper, we propose a novel Deep Covariance Alignment (DCA) model for UDA RSI segmentation. The DCA can explicitly align category features to learn shared domain-invariant discriminative feature representations, which enhances the ability of model generalization. Specifically, a Category Feature Pooling (CFP) module is first employed to extract category features by combining the coarse outputs and the deep features. Then, we leverage a novel Covariance Regularization (CR) to enforce the intra-category features to be closer and the inter-category features to be further separate. Compared with the existing category alignment methods, our CR aims to regularize the correlation between different dimensions of the features and thus performs more robustly when dealing with the divergent category features of imbalanced and inconsistent distributions. Finally, we propose a stagewise procedure to train the DCA in order to alleviate the error accumulation. Experiments on both Rural-to-Urban and Urban-to-Rural scenarios of the LoveDA dataset demonstrate the superiority of our proposed DCA over other state-of-the-art UDA segmentation methods. Code is available at https://github.com/Luffy03/DCA.
Submitted 9 January, 2024;
originally announced January 2024.
-
FedDiff: Diffusion Model Driven Federated Learning for Multi-Modal and Multi-Clients
Authors:
DaiXun Li,
Weiying Xie,
ZiXuan Wang,
YiBing Lu,
Yunsong Li,
Leyuan Fang
Abstract:
With the rapid development of imaging sensor technology in the field of remote sensing, multi-modal remote sensing data fusion has emerged as a crucial research direction for land cover classification tasks. While diffusion models have made great progress in generative models and image classification tasks, existing models primarily focus on single-modality and single-client control, that is, the diffusion process is driven by a single modal in a single computing node. To facilitate the secure fusion of heterogeneous data from clients, it is necessary to enable distributed multi-modal control, such as merging the hyperspectral data of organization A and the LiDAR data of organization B privately on each base station client. In this study, we propose a multi-modal collaborative diffusion federated learning framework called FedDiff. Our framework establishes a dual-branch diffusion model feature extraction setup, where the two modal data are inputted into separate branches of the encoder. Our key insight is that diffusion models driven by different modalities are inherently complementary in terms of potential denoising steps on which bilateral connections can be built. Considering the challenge of private and efficient communication between multiple clients, we embed the diffusion model into the federated learning communication structure, and introduce a lightweight communication module. Qualitative and quantitative experiments validate the superiority of our framework in terms of image quality and conditional consistency.
Submitted 15 November, 2023;
originally announced January 2024.
-
U-Mixer: An Unet-Mixer Architecture with Stationarity Correction for Time Series Forecasting
Authors:
Xiang Ma,
Xuemei Li,
Lexin Fang,
Tianlong Zhao,
Caiming Zhang
Abstract:
Time series forecasting is a crucial task in various domains. Owing to factors such as trends, seasonality, or irregular fluctuations, time series often exhibit non-stationarity. It obstructs stable feature propagation through deep layers, disrupts feature distributions, and complicates learning data distribution changes. As a result, many existing models struggle to capture the underlying patterns, leading to degraded forecasting performance. In this study, we tackle the challenge of non-stationarity in time series forecasting with our proposed framework called U-Mixer. By combining Unet and Mixer, U-Mixer effectively captures local temporal dependencies between different patches and channels separately to avoid the influence of distribution variations among channels, and merges low- and high-level features to obtain comprehensive data representations. The key contribution is a novel stationarity correction method, explicitly restoring data distribution by constraining the difference in stationarity between the data before and after model processing to restore the non-stationarity information, while ensuring the temporal dependencies are preserved. Through extensive experiments on various real-world time series datasets, U-Mixer demonstrates its effectiveness and robustness, and achieves 14.5% and 7.7% improvements over state-of-the-art (SOTA) methods.
Submitted 4 January, 2024;
originally announced January 2024.
-
Shape-programmable Adaptive Multi-material Microrobots for Biomedical Applications
Authors:
Liyuan Tan,
Yang Yang,
Li Fang,
David J. Cappelleri
Abstract:
Flagellated microorganisms can swim at low Reynolds numbers and adapt to changes in their environment. Specifically, the flagella can switch their shapes or modes through gene expression. In the past decade, efforts have been made to fabricate and investigate rigid types of microrobots without any adaptation to the environments. More recently, obtaining adaptive microrobots mimicking real microorganisms is getting more attention. However, even though some adaptive microrobots achieved by hydrogels have emerged, the swimming behaviors of the microrobots before and after the environment-induced deformations are not predicted in a systematic standardized way. In this work, experiments, finite element analysis, and dynamic modeling are presented together to realize a complete understanding of these adaptive microrobots. The above three parts are cross-verified proving the success of using such methods, facilitating the bio-applications with shape-programmable and even swimming performance-programmable microrobots. Moreover, an application of targeted object delivery using the proposed microrobot has been successfully demonstrated. Finally, cytotoxicity tests are performed to prove the potential for using the proposed microrobot for biomedical applications.
Submitted 30 December, 2023;
originally announced January 2024.
-
SSL-OTA: Unveiling Backdoor Threats in Self-Supervised Learning for Object Detection
Authors:
Qiannan Wang,
Changchun Yin,
Lu Zhou,
Liming Fang
Abstract:
The extensive adoption of self-supervised learning (SSL) has led to an increased security threat from backdoor attacks. While existing research has mainly focused on backdoor attacks in image classification, there has been limited exploration of their implications for object detection. Object detection plays a critical role in security-sensitive applications, such as autonomous driving, where backdoor attacks seriously threaten human life and property. In this work, we propose the first backdoor attack designed for object detection tasks in SSL scenarios, called Object Transform Attack (SSL-OTA). SSL-OTA employs a trigger capable of altering predictions of the target object to the desired category, encompassing two attacks: Naive Attack (NA) and Dual-Source Blending Attack (DSBA). NA conducts data poisoning during downstream fine-tuning of the object detector, while DSBA additionally injects backdoors into the pre-trained encoder. We establish appropriate metrics and conduct extensive experiments on benchmark datasets, demonstrating the effectiveness of our proposed attack and its resistance to potential defenses. Notably, both NA and DSBA achieve high attack success rates (ASR) at extremely low poisoning rates (0.5%). The results underscore the importance of considering backdoor threats in SSL-based object detection and contribute a novel perspective to the field.
Submitted 12 June, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
-
High Fidelity Human Trajectory Tracking Based on Surveillance Camera Data
Authors:
Zexu Li,
Lei Fang
Abstract:
Human crowds exhibit a wide range of interesting patterns, and measuring them is of great interest in areas ranging from psychology and social science to civil engineering. While in situ measurements of human crowd patterns require large amounts of time and labor to obtain, human crowd experiments may result in statistics different from those that would emerge with a naturally emerging crowd. Here we present a simple, broadly applicable, highly accurate human crowd tracking technique to extract high-fidelity kinematic information from widely available surveillance camera videos. With the proposed technique, researchers can access scientific crowd data on a scale that is orders of magnitude larger than before. In addition to measuring an individual's time-resolved position and velocity, our technique also offers high-validity time-resolved acceleration, step frequency, and step length. We demonstrate the applicability of our technique by applying it to surveillance camera videos of Shinjuku, Tokyo, streamed on YouTube and exploiting its high fidelity to expose the hidden contribution of walking speed variance at the crossroad. The high fidelity and simplicity of this powerful technique open up the way to utilize the large volume of existing surveillance camera data around the world for scientific studies.
Submitted 26 December, 2023;
originally announced December 2023.
-
Knowledge Distillation of LLM for Automatic Scoring of Science Education Assessments
Authors:
Ehsan Latif,
Luyang Fang,
** Ma,
Xiaoming Zhai
Abstract:
This study proposes a method for knowledge distillation (KD) of fine-tuned Large Language Models (LLMs) into smaller, more efficient, and accurate neural networks. We specifically target the challenge of deploying these models on resource-constrained devices. Our methodology involves training the smaller student model (a neural network) using the prediction probabilities (as soft labels) of the LLM, which serves as a teacher model. This is achieved through a specialized loss function tailored to learn from the LLM's output probabilities, ensuring that the student model closely mimics the teacher's performance. To validate the performance of the KD approach, we utilized a large dataset, 7T, containing 6,684 student-written responses to science questions and three mathematical reasoning datasets with student-written responses graded by human experts. We compared accuracy with state-of-the-art (SOTA) distilled models, TinyBERT, and artificial neural network (ANN) models. Results show that the KD approach has 3% and 2% higher scoring accuracy than ANN and TinyBERT, respectively, and comparable accuracy to the teacher model. Furthermore, the student model size is 0.03M, 4,000 times smaller in parameters and 10 times faster in inference than the teacher model and TinyBERT, respectively. The significance of this research lies in its potential to make advanced AI technologies accessible in typical educational settings, particularly for automatic scoring.
Submitted 11 June, 2024; v1 submitted 25 December, 2023;
originally announced December 2023.
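The abstract above describes training a student on the teacher's prediction probabilities as soft labels. The following is a minimal sketch of one common form of such a loss (cross-entropy between teacher probabilities and the student's softmax output), not the paper's specialized loss function. The function names and the `temperature` parameter are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_probs, temperature=1.0):
    """Cross-entropy of the student's distribution against teacher soft labels.

    Hypothetical helper: `teacher_probs` stands in for the LLM's output
    probabilities; classes with zero teacher probability contribute nothing.
    """
    student_probs = softmax(student_logits, temperature)
    return -sum(
        t * math.log(s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0
    )
```

A student whose logits favor the class the teacher prefers receives a lower loss than a student with uniform logits, which is the behavior the soft-label objective relies on.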
-
A Reinforcement-Learning-Based Multiple-Column Selection Strategy for Column Generation
Authors:
Haofeng Yuan,
Lichang Fang,
Shiji Song
Abstract:
Column generation (CG) is one of the most successful approaches for solving large-scale linear programming (LP) problems. Given an LP with a prohibitively large number of variables (i.e., columns), the idea of CG is to explicitly consider only a subset of columns and iteratively add potential columns to improve the objective value. While adding the column with the most negative reduced cost can guarantee the convergence of CG, it has been shown that adding multiple columns per iteration rather than a single column can lead to faster convergence. However, it remains a challenge to design a multiple-column selection strategy to select the most promising columns from a large number of candidate columns. In this paper, we propose a novel reinforcement-learning-based (RL) multiple-column selection strategy. To the best of our knowledge, it is the first RL-based multiple-column selection strategy for CG. The effectiveness of our approach is evaluated on two sets of problems: the cutting stock problem and the graph coloring problem. Compared to several widely used single-column and multiple-column selection strategies, our RL-based multiple-column selection strategy leads to faster convergence and achieves remarkable reductions in the number of CG iterations and runtime.
Submitted 28 December, 2023; v1 submitted 21 December, 2023;
originally announced December 2023.
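For orientation, the classical greedy rule that the abstract above contrasts with can be sketched as follows: compute each candidate column's reduced cost from the current dual values and add the k most negative ones. This is a baseline illustration for a minimization LP, not the RL-based strategy proposed in the paper. All names (`reduced_cost`, `select_columns`, `k`, `tol`) are assumptions.

```python
def reduced_cost(cost, column, duals):
    """Reduced cost c_j - y^T a_j of one candidate column."""
    return cost - sum(y * a for y, a in zip(duals, column))


def select_columns(candidates, duals, k=1, tol=1e-9):
    """Return indices of up to k candidates with the most negative reduced cost.

    `candidates` is a list of (cost, column) pairs; only columns whose
    reduced cost is below -tol can improve a minimization objective.
    """
    scored = [
        (reduced_cost(cost, column, duals), index)
        for index, (cost, column) in enumerate(candidates)
    ]
    improving = sorted(pair for pair in scored if pair[0] < -tol)
    return [index for _, index in improving[:k]]
```

With k=1 this reduces to the single-column rule mentioned in the abstract; larger k adds several improving columns per iteration.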
-
Densify Your Labels: Unsupervised Clustering with Bipartite Matching for Weakly Supervised Point Cloud Segmentation
Authors:
Shaobo Xia,
Jun Yue,
Kacper Kania,
Leyuan Fang,
Andrea Tagliasacchi,
Kwang Moo Yi,
Weiwei Sun
Abstract:
We propose a weakly supervised semantic segmentation method for point clouds that predicts "per-point" labels from just "whole-scene" annotations while achieving the performance of recent fully supervised approaches. Our core idea is to propagate the scene-level labels to each point in the point cloud by creating pseudo labels in a conservative way. Specifically, we over-segment point cloud features via unsupervised clustering and associate scene-level labels with clusters through bipartite matching, thus propagating scene labels only to the most relevant clusters, leaving the rest to be guided solely via unsupervised clustering. We empirically demonstrate that over-segmentation and bipartite assignment play a crucial role. We evaluate our method on the ScanNet and S3DIS datasets, outperforming the state of the art, and demonstrate that we can achieve results comparable to fully supervised methods.
Submitted 11 December, 2023;
originally announced December 2023.
-
Physics Inspired Criterion for Pruning-Quantization Joint Learning
Authors:
Weiying Xie,
Xiaoyi Fan,
Xin Zhang,
Yunsong Li,
Jie Lei,
Leyuan Fang
Abstract:
Pruning-quantization joint learning always facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices. However, most existing methods do not jointly learn a global criterion for pruning and quantization in an interpretable way. In this paper, we propose a novel physics inspired criterion for pruning-quantization joint learning (PIC-PQ), which is explored from an analogy we first draw between elasticity dynamics (ED) and model compression (MC). Specifically, derived from Hooke's law in ED, we establish a linear relationship between the filters' importance distribution and the filter property (FP) by a learnable deformation scale in the physics inspired criterion (PIC). Furthermore, we extend PIC with a relative shift variable for a global view. To ensure feasibility and flexibility, an available maximum bitwidth and a penalty factor are introduced in quantization bitwidth assignment. Experiments on image classification benchmarks demonstrate that PIC-PQ yields a good trade-off between accuracy and bit-operations (BOPs) compression ratio (e.g., 54.96X BOPs compression ratio in ResNet56 on CIFAR10 with 0.10% accuracy drop and 53.24X in ResNet18 on ImageNet with 0.61% accuracy drop). The code will be available at https://github.com/fanxxxxyi/PIC-PQ.
Submitted 4 June, 2024; v1 submitted 1 December, 2023;
originally announced December 2023.
-
Variants of Tagged Sentential Decision Diagrams
Authors:
Deyuan Zhong,
Mingwei Zhang,
Quanlong Guan,
Liangda Fang,
Zhaorong Lai,
Yong Lai
Abstract:
A recently proposed canonical form of Boolean functions, namely tagged sentential decision diagrams (TSDDs), exploits both the standard and zero-suppressed trimming rules. The standard ones minimize the size of sentential decision diagrams (SDDs) while the zero-suppressed trimming rules have the same objective as the standard ones but for zero-suppressed sentential decision diagrams (ZSDDs). The original TSDDs, which we call zero-suppressed TSDDs (ZTSDDs), firstly fully utilize the zero-suppressed trimming rules, and then the standard ones. In this paper, we present a variant of TSDDs which we call standard TSDDs (STSDDs) by reversing the order of trimming rules. We then prove the canonicity of STSDDs and present the algorithms for binary operations on TSDDs. In addition, we offer two kinds of implementations of STSDDs and ZTSDDs and acquire three variations of the original TSDDs. Experimental evaluations demonstrate that the four versions of TSDDs have the size advantage over SDDs and ZSDDs.
Submitted 16 November, 2023;
originally announced December 2023.
-
OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning
Authors:
Haiyang Ying,
Yixuan Yin,
**zhi Zhang,
Fan Wang,
Tao Yu,
Ruqi Huang,
Lu Fang
Abstract:
Towards holistic understanding of 3D scenes, a general 3D segmentation method is needed that can segment diverse objects without restrictions on object quantity or categories, while also reflecting the inherent hierarchical structure. To achieve this, we propose OmniSeg3D, an omniversal segmentation method that aims to segment anything in 3D all at once. The key insight is to lift multi-view inconsistent 2D segmentations into a consistent 3D feature field through a hierarchical contrastive learning framework, which is accomplished in two steps. Firstly, we design a novel hierarchical representation based on category-agnostic 2D segmentations to model the multi-level relationship among pixels. Secondly, image features rendered from the 3D feature field are clustered at different levels, which can be further drawn closer or pushed apart according to the hierarchical relationship between different levels. In tackling the challenges posed by inconsistent 2D segmentations, this framework yields a globally consistent 3D feature field, which further enables hierarchical segmentation, multi-object selection, and global discretization. Extensive experiments demonstrate the effectiveness of our method on high-quality 3D segmentation and accurate hierarchical structure understanding. A graphical user interface further facilitates flexible interaction for omniversal 3D segmentation.
Submitted 20 November, 2023;
originally announced November 2023.
-
FedFusion: Manifold Driven Federated Learning for Multi-satellite and Multi-modality Fusion
Authors:
DaiXun Li,
Weiying Xie,
Yunsong Li,
Leyuan Fang
Abstract:
Multi-satellite, multi-modality in-orbit fusion is a challenging task as it explores the fusion representation of complex high-dimensional data under limited computational resources. Deep neural networks can reveal the underlying distribution of multi-modal remote sensing data, but the in-orbit fusion of multimodal data is more difficult because of the limitations of different sensor imaging characteristics, especially when the multimodal data are non-independent and identically distributed (Non-IID). To address this problem while maintaining classification performance, this paper proposes a manifold-driven multi-modality fusion framework, FedFusion, which randomly samples local data on each client to jointly estimate the prominent manifold structure of shallow features of each client and explicitly compresses the feature matrices into a low-rank subspace through cascading and additive approaches, which is used as the feature input of the subsequent classifier. Considering the physical space limitations of the satellite constellation, we developed a multimodal federated learning module designed specifically for manifold data in a deep latent space. This module achieves iterative updating of the sub-network parameters of each client through global weighted averaging, constructing a framework that can represent compact representations of each client. The proposed framework surpasses existing methods in terms of performance on three multimodal datasets, achieving a classification average accuracy of 94.35% while compressing communication costs by a factor of 4. Furthermore, extensive numerical evaluations of real-world satellite images were conducted on the orbiting edge computing architecture based on Jetson TX2 industrial modules, which demonstrated that FedFusion significantly reduced training time by 48.4 minutes (15.18%) while optimizing accuracy.
Submitted 15 November, 2023;
originally announced November 2023.
-
Retro-BLEU: Quantifying Chemical Plausibility of Retrosynthesis Routes through Reaction Template Sequence Analysis
Authors:
Junren Li,
Lei Fang,
Jian-Guang Lou
Abstract:
Computer-assisted methods have emerged as valuable tools for retrosynthesis analysis. However, quantifying the plausibility of generated retrosynthesis routes remains a challenging task. We introduce Retro-BLEU, a statistical metric adapted from the well-established BLEU score in machine translation, to evaluate the plausibility of retrosynthesis routes based on reaction template sequence analysis. We demonstrate the effectiveness of Retro-BLEU by applying it to a diverse set of retrosynthesis routes generated by state-of-the-art algorithms and compare its performance with other evaluation metrics. The results show that Retro-BLEU is capable of differentiating between plausible and implausible routes. Furthermore, we provide insights into the strengths and weaknesses of Retro-BLEU, paving the way for future developments and improvements in this field.
Submitted 7 November, 2023;
originally announced November 2023.
-
Dark-Field X-ray Microscopy for 2D and 3D imaging of Microstructural Dynamics at the European X-ray Free Electron Laser
Authors:
Sara J. Irvine,
Kento Katagiri,
Trygve M. Ræder,
Darshan Chalise,
Dayeeta Pal,
Jade I. Stanton,
Gabriele Ansaldi,
Ulrike Boesenberg,
Felix Brauße,
Jon H. Eggert,
Lichao Fang,
Eric Folsom,
Jörg Hallmann,
Morten Haubro,
Theodor S. Holstad,
Anders Madsen,
Johannes Möller,
Martin M. Nielsen,
Henning F. Poulsen,
Jan-Etienne Pudel,
Angel Rodriguez-Fernandez,
Frank Schoofs,
Frank Seiboth,
Yifan Wang,
Jo Wonhyuk,
et al. (4 additional authors not shown)
Abstract:
Dark field X-ray microscopy (DFXM) has enabled experiments to visualize microstructural distortions in bulk crystals. Using the femtosecond X-ray pulses generated by X-ray free-electron lasers (XFEL), DFXM can achieve ~1-μm spatial resolution and <100 fs time resolution simultaneously. In this paper, we present the first ultrafast DFXM measurements at the European XFEL. In this work, we demonstrate DFXM of laser-induced phonon wavepackets propagating through dislocations inside a diamond single crystal. In addition to demonstrating this new capability, we present two new DFXM scanning techniques for XFEL applications, 3D and axial-strain scans with sub-μm spatial resolution. With this progress to XFEL DFXM, we discuss new opportunities to study multi-timescale spatio-temporal dynamics of defects, strain waves, and other localized phenomena deep inside crystals.
Submitted 7 November, 2023;
originally announced November 2023.
-
An Investigation of Darwiche and Pearl's Postulates for Iterated Belief Update
Authors:
Quanlong Guan,
Tong Zhu,
Liangda Fang,
Junming Qiu,
Zhao-Rong Lai,
Weiqi Luo
Abstract:
Belief revision and update, two significant types of belief change, both focus on how an agent modifies her beliefs in the presence of new information. The most striking difference between them is that the former studies the change of beliefs in a static world while the latter concentrates on a dynamically changing world. The famous AGM and KM postulates were proposed to capture rational belief revision and update, respectively. However, both of them are too permissive to exclude some unreasonable changes in the iteration. In response to this weakness, the DP postulates and their extensions for iterated belief revision were presented. Furthermore, Rodrigues integrated these postulates in belief update. Unfortunately, his approach does not meet the basic requirement of iterated belief update. This paper is intended to solve this problem of Rodrigues's approach. Firstly, we present a modification of the original KM postulates based on belief states. Subsequently, we migrate several well-known postulates for iterated belief revision to iterated belief update. Moreover, we provide the exact semantic characterizations based on partial preorders for each of the proposed postulates. Finally, we analyze the compatibility between the above iterated postulates and the KM postulates for belief update.
Submitted 28 October, 2023;
originally announced October 2023.
-
Using GPT-4 to Augment Unbalanced Data for Automatic Scoring
Authors:
Luyang Fang,
Gyeong-Geon Lee,
Xiaoming Zhai
Abstract:
Machine learning-based automatic scoring can be challenging if students' responses are unbalanced across scoring categories, as it introduces uncertainty in the machine training process. To meet this challenge, we introduce a novel text data augmentation framework using GPT-4, a generative large language model, specifically tailored for unbalanced datasets in automatic scoring. Our experimental dataset comprised student-written responses to two science items. We crafted prompts for GPT-4 to generate responses resembling student-written answers, particularly for the minority scoring classes, to augment the data. We then fine-tuned DistilBERT for automatic scoring based on the augmented and original datasets. Model performance was assessed using accuracy, precision, recall, and F1 score. We incorporated varied amounts of augmented data to examine scoring performance, and our findings revealed remarkably improved model performance. The average maximum increase observed across the two items is 3.5% for accuracy, 30.6% for precision, 21.1% for recall, and 24.2% for F1 score. Notably, using just 5% of the augmented data led to substantial improvements: 2.6%, 29.2%, 15.1%, and 19.6%. Interestingly, the extent of improvement varied depending on specific datasets. Moreover, we found that a varying amount of augmented data (5%-40%) was needed to obtain a stable improvement. We also compare models trained with GPT-4 augmented data and those trained with additional student-written responses. The findings indicate that the former match or even exceed the performance of the latter. Specifically, there is an average difference of 1.7%, 1.9%, 11.0%, and 7.8% for the four metrics, respectively. This research underscores the potential and effectiveness of data augmentation techniques utilizing GPT-4 in addressing unbalanced datasets within automated assessment.
Submitted 17 November, 2023; v1 submitted 24 October, 2023;
originally announced October 2023.
-
Numerical algorithm and complexity analysis for diagonalization of multivariate homogeneous polynomials
Authors:
Lishan Fang,
Hua-Lin Huang,
Yuechen Li
Abstract:
We study the computational complexity of a criterion and an algorithm for diagonalization of multivariate homogeneous polynomials, that is, expressing them as sums of powers of independent linear forms. They are based on Harrison's center theory and only require solving linear and quadratic systems of equations. Detailed descriptions and computational complexity of each step of the algorithm are provided. The complexity analysis focuses on the impacts of problem sizes, including the number of variables and the degree of given polynomials. We show that this algorithm runs in polynomial time and validate it through numerical experiments. Other diagonalization algorithms are reviewed and compared in terms of complexity.
Submitted 21 October, 2023;
originally announced October 2023.
-
Enhancing Text-based Knowledge Graph Completion with Zero-Shot Large Language Models: A Focus on Semantic Enhancement
Authors:
Rui Yang,
Jiahao Zhu,
Jian** Man,
Li Fang,
Yi Zhou
Abstract:
The design and development of text-based knowledge graph completion (KGC) methods leveraging textual entity descriptions are at the forefront of research. These methods involve advanced optimization techniques such as soft prompts and contrastive learning to enhance KGC models. The effectiveness of text-based methods largely hinges on the quality and richness of the training data. Large language models (LLMs) can utilize straightforward prompts to alter text data, thereby enabling data augmentation for KGC. Nevertheless, LLMs typically demand substantial computational resources. To address these issues, we introduce a framework termed constrained prompts for KGC (CP-KGC). This CP-KGC framework designs prompts that adapt to different datasets to enhance semantic richness. Additionally, CP-KGC employs a context constraint strategy to effectively identify polysemous entities within KGC datasets. Through extensive experimentation, we have verified the effectiveness of this framework. Even after quantization, the LLM (Qwen-7B-Chat-int4) still enhances the performance of text-based KGC methods (code and datasets are available at https://github.com/sjlmg/CP-KGC). This study extends the performance limits of existing models and promotes further integration of KGC with LLMs.
Submitted 27 June, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and Multispectral Data Fusion
Authors:
**ghui Qin,
Lihuang Fang,
Ruitao Lu,
Liang Lin,
Yukai Shi
Abstract:
Deep learning-based hyperspectral image (HSI) super-resolution, which aims to generate high spatial resolution HSI (HR-HSI) by fusing hyperspectral image (HSI) and multispectral image (MSI) with deep neural networks (DNNs), has attracted lots of attention. However, neural networks require large amounts of training data, hindering their application in real-world scenarios. In this letter, we propose a novel adversarial automatic data augmentation framework ADASR that automatically optimizes and augments HSI-MSI sample pairs to enrich data diversity for HSI-MSI fusion. Our framework is sample-aware and optimizes an augmentor network and two downsampling networks jointly by adversarial learning so that we can learn more robust downsampling networks for training the upsampling network. Extensive experiments on two public classical hyperspectral datasets demonstrate the effectiveness of our ADASR compared to the state-of-the-art methods.
Submitted 11 October, 2023;
originally announced October 2023.
-
Spectral extremal results on edge blow-up of graphs
Authors:
Longfei Fang,
Huiqiu Lin
Abstract:
Let ${\rm ex}(n,F)$ and ${\rm spex}(n,F)$ be the maximum size and maximum spectral radius of an $F$-free graph of order $n$, respectively. The value ${\rm spex}(n,F)$ is called the spectral extremal value of $F$. Nikiforov [J. Graph Theory 62 (2009) 362--368] gave the spectral Stability Lemma, which implies that for every $\varepsilon>0$, sufficiently large $n$ and a non-bipartite graph $H$ with chromatic number $\chi(H)$, the extremal graph for ${\rm spex}(n,H)$ can be obtained from the Turán graph $T_{\chi(H)-1}(n)$ by adding and deleting at most $\varepsilon n^2$ edges. It is still a challenging problem to determine the exact spectral extremal values of many non-bipartite graphs. Given a graph $F$ and an integer $p\geq 2$, the edge blow-up of $F$, denoted by $F^{p+1}$, is the graph obtained from replacing each edge in $F$ by a $K_{p+1}$ where the new vertices of $K_{p+1}$ are all distinct. In this paper, we determine the exact spectral extremal values of the edge blow-up of all non-bipartite graphs and provide the asymptotic spectral extremal values of the edge blow-up of all bipartite graphs for sufficiently large $n$, which can be seen as a spectral version of the theorem on ${\rm ex}(n,F^{p+1})$ given by Yuan [J. Combin. Theory Ser. B 152 (2022) 379--398]. As applications, on the one hand, we generalize several previous results on ${\rm spex}(n,F^{p+1})$ for $F$ being a matching and a star for $p\geq 3$. On the other hand, we obtain the exact values of ${\rm spex}(n,F^{p+1})$ for $F$ being a path, a cycle and a complete graph.
Submitted 18 December, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
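The edge blow-up $F^{p+1}$ defined in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `edge_blow_up`, the edge-list input format, and the tuple labels used for new vertices are all assumptions made for illustration, and $F$ is assumed to be a simple graph without isolated vertices.

```python
from itertools import combinations


def edge_blow_up(edges, p):
    """Build the edge blow-up F^{p+1} of a simple graph F given as an edge list.

    Each edge uv of F is replaced by a clique K_{p+1} on u, v and p - 1
    brand-new vertices, so new vertices are never shared between edges.
    Returns (vertices, edges) with edges stored as frozensets.
    Hypothetical helper for illustration only.
    """
    if p < 2:
        raise ValueError("the definition requires p >= 2")
    vertices = {x for edge in edges for x in edge}
    blown_edges = set()
    for idx, (u, v) in enumerate(edges):
        # p - 1 fresh vertices, labelled by the index of the edge they belong to
        fresh = [("new", idx, j) for j in range(p - 1)]
        vertices.update(fresh)
        # make u, v and the fresh vertices pairwise adjacent
        for a, b in combinations([u, v] + fresh, 2):
            blown_edges.add(frozenset((a, b)))
    return vertices, blown_edges


# Blow-up of a triangle with p = 2: every edge becomes its own triangle.
triangle = [(0, 1), (1, 2), (2, 0)]
tri_vertices, tri_edges = edge_blow_up(triangle, 2)
print(len(tri_vertices), len(tri_edges))
```

For a simple graph $F$ without isolated vertices, the result has $|V(F)| + (p-1)|E(F)|$ vertices and $\binom{p+1}{2}|E(F)|$ edges, since two distinct cliques share at most one original vertex.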
-
Interaction between swarming active matter and flow: the impact on Lagrangian coherent structures
Authors:
Xinyu Si,
Lei Fang
Abstract:
In recent years, research on active matter has attracted interest from diverse communities. It has been suggested that active matter, as represented by zooplankton, has the potential to contribute to ocean mixing owing to its intrinsic mobility and sheer biomass. However, prior investigations have predominantly overlooked the influence of external background flow, despite the ubiquity of flow driven by various sources in nature. The interaction between active matter and external flow structures has long been neglected. Here, we conducted experiments using a typical centimeter-scale swimmer, \textit{A. salina}, and an electromagnetically driven quasi-two-dimensional flow to study the interaction between active matter and flow. We focused on the impact of swarming active matter on hyperbolic Lagrangian coherent structures (LCSs), which mark the most straining regions in the flow. We showed that the impact of active matter on LCSs was much more significant than that of localized random noise with similar energy input. In addition, we revealed that the perturbation generated by active matter could couple with the background flow and further deform the LCSs. We also found that the rotational elliptical region of the flow was much more susceptible to active matter perturbation. We further described how the influence of active matter changed with its number density and the background flow intensity. We found that the LCSs could be appreciably altered even at a small number density of active matter. We aim to provide valuable insights and draw attention to the problem of the interaction between active matter and external flow structures.
Submitted 8 November, 2023; v1 submitted 6 October, 2023;
originally announced October 2023.