-
Denoising-Aware Contrastive Learning for Noisy Time Series
Authors:
Shuang Zhou,
Daochen Zha,
Xiao Shen,
Xiao Huang,
Rui Zhang,
Fu-Lai Chung
Abstract:
Time series self-supervised learning (SSL) aims to exploit unlabeled data for pre-training to mitigate the reliance on labels. Despite the great success in recent years, there is limited discussion on the potential noise in the time series, which can severely impair the performance of existing SSL methods. To mitigate the noise, the de facto strategy is to apply conventional denoising methods before model training. However, this pre-processing approach may not fully eliminate the effect of noise in SSL for two reasons: (i) the diverse types of noise in time series make it difficult to automatically determine suitable denoising methods; (ii) noise can be amplified after mapping raw data into latent space. In this paper, we propose denoising-aware contrastive learning (DECL), which uses contrastive learning objectives to mitigate the noise in the representation and automatically selects suitable denoising methods for every sample. Extensive experiments on various datasets verify the effectiveness of our method. The code is open-sourced.
Submitted 7 June, 2024;
originally announced June 2024.
-
Quark flavor violation and axion-like particles from top-quark decays at the LHC
Authors:
Kingman Cheung,
Fei-Tung Chung,
Giovanna Cottin,
Zeren Simon Wang
Abstract:
We study axion-like particles (ALPs) with quark-flavor-violating couplings at the LHC. Specifically, we focus on the theoretical scenario with ALP-top-up and ALP-top-charm interactions, in addition to the more common quark-flavor-diagonal couplings. The ALPs can thus originate from decays of top quarks which are pair produced in large numbers at the LHC, and then decay to jets. If these couplings to the quarks are tiny and the ALPs have $\mathcal{O}(10)$ GeV masses, they are long-lived, leading to signatures of displaced vertex plus multiple jets, which have the advantage of suppression of background events at the LHC. We recast a recent ATLAS search for the same signature and reinterpret the results in terms of bounds on the long-lived ALP in our theoretical scenario. We find that the LHC with the full Run 2 dataset can place stringent limits, while at the future high-luminosity LHC with 3 ab$^{-1}$ integrated luminosity stronger sensitivities are expected.
Submitted 9 April, 2024;
originally announced April 2024.
-
TransFlower: An Explainable Transformer-Based Model with Flow-to-Flow Attention for Commuting Flow Prediction
Authors:
Yan Luo,
Zhuoyue Wan,
Yuzhong Chen,
Gengchen Mai,
Fu-lai Chung,
Kent Larson
Abstract:
Understanding the link between urban planning and commuting flows is crucial for guiding urban development and policymaking. This research, bridging computer science and urban studies, addresses the challenge of integrating these fields with their distinct focuses. Traditional urban studies methods, like the gravity and radiation models, often underperform in complex scenarios due to their limited handling of multiple variables and reliance on overly simplistic and unrealistic assumptions, such as spatial isotropy. While deep learning models offer improved accuracy, their black-box nature poses a trade-off between performance and explainability -- both vital for analyzing complex societal phenomena like commuting flows. To address this, we introduce TransFlower, an explainable, transformer-based model employing flow-to-flow attention to predict urban commuting patterns. It features a geospatial encoder with an anisotropy-aware relative location encoder for nuanced flow representation. Following this, the transformer-based flow predictor enhances this by leveraging attention mechanisms to efficiently capture flow interactions. Our model outperforms existing methods by up to 30.8% Common Part of Commuters, offering insights into mobility dynamics crucial for urban planning and policy decisions.
Submitted 23 February, 2024;
originally announced February 2024.
-
Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation
Authors:
Shuang Zhou,
Xiao Huang,
Ninghao Liu,
Huachi Zhou,
Fu-Lai Chung,
Long-Kai Huang
Abstract:
Graph anomaly detection (GAD) is a vital task since even a few anomalies can pose huge threats to benign users. Recent semi-supervised GAD methods, which can effectively leverage the available labels as prior knowledge, have outperformed unsupervised methods. In practice, people usually need to identify anomalies on new (sub)graphs to secure their business, but they may lack labels to train an effective detection model. One natural idea is to directly apply a trained GAD model to the new (sub)graph for testing. However, we find that existing semi-supervised GAD methods generalize poorly, i.e., well-trained models may not perform well on an unseen area (i.e., not accessible in training) of the same graph, which can cause serious problems. Building on this observation, we propose a general and novel research problem of generalized graph anomaly detection that aims to effectively identify anomalies on both the training-domain graph and the unseen testing graph to eliminate potential dangers. Nevertheless, it is a challenging task since only limited labels are available, and the normal background may differ between training and testing data. Accordingly, we propose a data augmentation method named AugAN (Augmentation for Anomaly and Normal distributions) to enrich training data and boost the generalizability of GAD models. Experiments verify the effectiveness of our method in improving model generalizability.
Submitted 18 June, 2023;
originally announced June 2023.
-
Timestamps as Prompts for Geography-Aware Location Recommendation
Authors:
Yan Luo,
Haoyi Duan,
Ye Liu,
Fu-lai Chung
Abstract:
Location recommendation plays a vital role in improving users' travel experience. The timestamp of the POI to be predicted is of great significance, since a user will go to different places at different times. However, most existing methods either do not use this kind of temporal information, or just implicitly fuse it with other contextual information. In this paper, we revisit the problem of location recommendation and point out that explicitly modeling temporal information is a great help when the model needs to predict not only the next location but also further locations. In addition, state-of-the-art methods do not make effective use of geographic information and suffer from the hard boundary problem when encoding geographic information by gridding. To this end, a Temporal Prompt-based and Geography-aware (TPG) framework is proposed. The temporal prompt is firstly designed to incorporate temporal information of any further check-in. A shifted window mechanism is then devised to augment geographic data for addressing the hard boundary problem. Via extensive comparisons with existing methods and ablation studies on five real-world datasets, we demonstrate the effectiveness and superiority of the proposed method under various settings. Most importantly, our proposed model has the superior ability of interval prediction. In particular, the model can predict the location that a user wants to go to at a certain time while the most recent check-in behavioral data is masked, or it can predict specific future check-in (not just the next one) at a given timestamp.
Submitted 8 April, 2023;
originally announced April 2023.
-
End-to-End Personalized Next Location Recommendation via Contrastive User Preference Modeling
Authors:
Yan Luo,
Ye Liu,
Fu-lai Chung,
Yu Liu,
Chang Wen Chen
Abstract:
Predicting the next location is a highly valuable and common need in many location-based services such as destination prediction and route planning. The goal of next location recommendation is to predict the next point-of-interest a user might go to based on the user's historical trajectory. Most existing models learn mobility patterns merely from users' historical check-in sequences while overlooking the significance of user preference modeling. In this work, a novel Point-of-Interest Transformer (POIFormer) with contrastive user preference modeling is developed for end-to-end next location recommendation. This model consists of three major modules: history encoder, query generator, and preference decoder. History encoder is designed to model mobility patterns from historical check-in sequences, while query generator explicitly learns user preferences to generate user-specific intention queries. Finally, preference decoder combines the intention queries and historical information to predict the user's next location. Extensive comparisons with representative schemes and ablation studies on four real-world datasets demonstrate the effectiveness and superiority of the proposed scheme under various settings.
Submitted 22 March, 2023;
originally announced March 2023.
-
Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation
Authors:
Shuang Zhou,
Xiao Huang,
Ninghao Liu,
Fu-Lai Chung,
Long-Kai Huang
Abstract:
Graph anomaly detection (GAD) is a vital task since even a few anomalies can pose huge threats to benign users. Recent semi-supervised GAD methods, which can effectively leverage the available labels as prior knowledge, have outperformed unsupervised methods. In practice, people usually need to identify anomalies on new (sub)graphs to secure their business, but they may lack labels to train an effective detection model. One natural idea is to directly apply a trained GAD model to the new (sub)graph for testing. However, we find that existing semi-supervised GAD methods generalize poorly, i.e., well-trained models may not perform well on an unseen area (i.e., not accessible in training) of the same graph, which can cause serious problems. Building on this observation, we propose a general and novel research problem of generalized graph anomaly detection that aims to effectively identify anomalies on both the training-domain graph and the unseen testing graph to eliminate potential dangers. Nevertheless, it is a challenging task since only limited labels are available, and the normal background may differ between training and testing data. Accordingly, we propose a data augmentation method named AugAN (Augmentation for Anomaly and Normal distributions) to enrich training data and boost the generalizability of GAD models. Experiments verify the effectiveness of our method in improving model generalizability.
Submitted 23 July, 2023; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Quasi-Random Influences of Boolean Functions
Authors:
Fan Chung,
Nicholas Sieger
Abstract:
We examine a hierarchy of equivalence classes of quasi-random properties of Boolean Functions. In particular, we prove an equivalence between a number of properties including balanced influences, spectral discrepancy, local strong regularity, homomorphism enumerations of colored or weighted graphs and hypergraphs associated with Boolean functions as well as the $k$th-order strict avalanche criterion amongst others. We further construct families of quasi-random boolean functions which exhibit the properties of our equivalence theorem and separate the levels of our hierarchy.
Submitted 8 September, 2022;
originally announced September 2022.
-
Fan-complete Ramsey numbers
Authors:
Fan Chung,
Qizhong Lin
Abstract:
We consider Ramsey numbers $r(G,H)$ with tight lower bounds, namely, \begin{align*} r(G,H) \geq (χ(G)-1)(|H|-1)+1, \end{align*} where $χ(G)$ denotes the chromatic number of $G$ and $|H|$ denotes the number of vertices in $H$. We say $H$ is $G$-good if the equality holds.
In this paper, we prove that the fan-graph $F_n=K_1 + n K_2$ is $K_p$-good if $n\geq 27p^2$, improving previous tower-type lower bounds for $n$ due to Li and Rousseau (1996). The join graph $G+H$ is defined by adding all edges between the disjoint vertex sets of $G$ and $H$. Let $nH$ denote the union graph of $n$ disjoint copies of $H$. We show that $K_1+nH$ is $K_p$-good if $n$ is sufficiently large. We give a stronger lower bound inequality for Ramsey number $r(G, K_1+F)$ for the case of $G=K_p(a_1, a_2, \dots, a_p)$, the complete $p$-partite graph with $a_1=1$ and $a_i \leq a_{i+1}$. In particular, using a stability-supersaturation lemma by Fox, He and Wigderson (2021), we show that for any fixed graph $H$, \begin{align*} r(G,K_1+nH) = \left\{ \begin{array}{ll} (p-1)(n |H|+a_2-1)+1 & \textrm{if $n|H|+a_2-1$ or $a_2-1$ is even,}\\ (p-1)(n |H|+a_2-2)+1 & \textrm{otherwise,} \end{array}
\right. \end{align*} where $G=K_p(1,a_2, \dots, a_p)$ with $a_i$'s satisfying some mild conditions and $n$ is sufficiently large. The special case of $H=K_1$ gives an answer to Burr's question (1981) about the discrepancy of $r(G, K_{1,n})$ from $G$-goodness for sufficiently large $n$. All bounds of $n$ we obtain are not of tower-types.
Submitted 11 August, 2022;
originally announced August 2022.
-
Urban Region Profiling via A Multi-Graph Representation Learning Framework
Authors:
Y. Luo,
F. Chung,
K. Chen
Abstract:
Urban region profiling can benefit urban analytics. Although existing studies have made great efforts to learn urban region representation from multi-source urban data, there are still three limitations: (1) Most related methods focused merely on global-level inter-region relations while overlooking local-level geographical contextual signals and intra-region information; (2) Most previous works failed to develop an effective yet integrated fusion module which can deeply fuse multi-graph correlations; (3) State-of-the-art methods do not perform well in regions with high variance socioeconomic attributes. To address these challenges, we propose a multi-graph representative learning framework, called Region2Vec, for urban region profiling. Specifically, except that human mobility is encoded for inter-region relations, geographic neighborhood is introduced for capturing geographical contextual information while POI side information is adopted for representing intra-region information by knowledge graph. Then, graphs are used to capture accessibility, vicinity, and functionality correlations among regions. To consider the discriminative properties of multiple graphs, an encoder-decoder multi-graph fusion module is further proposed to jointly learn comprehensive representations. Experiments on real-world datasets show that Region2Vec can be employed in three applications and outperforms all state-of-the-art baselines. Particularly, Region2Vec has better performance than previous studies in regions with high variance socioeconomic attributes.
Submitted 4 February, 2022;
originally announced February 2022.
-
Forest formulas of discrete Green's functions
Authors:
Fan Chung,
Ji Zeng
Abstract:
The discrete Green's functions are the pseudoinverse (or the inverse) of the Laplacian (or its variations) of a graph. In this paper, we will give combinatorial interpretations of Green's functions in terms of enumerating trees and forests in a graph that will be used to derive further formulas for several graph invariants. For example, we show that the trace of the Green's function $\mathbf{G}$ associated with the combinatorial Laplacian of a connected simple graph $Γ$ on $n$ vertices satisfies $\text{Tr}(\mathbf{G})=\sum_{λ_i \neq 0} \frac 1 {λ_i}= \frac{1}{nτ}|\mathbb{F}^*_2|$, where $λ_i$ denotes the eigenvalues of the combinatorial Laplacian, $τ$ denotes the number of spanning trees and $\mathbb{F}^*_2$ denotes the set of rooted spanning $2$-forests in $Γ$. We will prove forest formulas for discrete Green's functions for directed and weighted graphs and apply them to study random walks on graphs and digraphs. We derive a forest expression of the hitting time for digraphs, which gives combinatorial proofs to old and new results about hitting times, traces of discrete Green's functions, and other related quantities.
Submitted 17 August, 2022; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Adaptation-Agnostic Meta-Training
Authors:
Jiaxin Chen,
Li-Ming Zhan,
Xiao-Ming Wu,
Fu-Lai Chung
Abstract:
Many meta-learning algorithms can be formulated into an interleaved process, in the sense that task-specific predictors are learned during inner-task adaptation and meta-parameters are updated during meta-update. The normal meta-training strategy needs to differentiate through the inner-task adaptation procedure to optimize the meta-parameters. This leads to a constraint that the inner-task algorithms should be solved analytically. Under this constraint, only simple algorithms with analytical solutions can be applied as the inner-task algorithms, limiting the model expressiveness. To lift the limitation, we propose an adaptation-agnostic meta-training strategy. Following our proposed strategy, we can apply stronger algorithms (e.g., an ensemble of different types of algorithms) as the inner-task algorithm to achieve superior performance comparing with popular baselines. The source code is available at https://github.com/jiaxinchen666/AdaptationAgnosticMetaLearning.
Submitted 24 August, 2021;
originally announced August 2021.
-
Mixed Set Domain Adaptation
Authors:
Sitong Mao,
Keli Zhang,
Fu-lai Chung
Abstract:
In the setting of conventional domain adaptation, categories of the source dataset are from the same domain (or domains for multi-source domain adaptation), which is not always true in reality. In this paper, we propose Mixed Set Domain Adaptation (MSDA). Under the setting of MSDA, different categories of the source dataset are not all collected from the same domain(s). For instance, categories $1\sim k$ are collected from domain $α$ while categories $k+1\sim c$ are collected from domain $β$. In such a situation, domain adaptation performance is further affected by the distribution discrepancy inside the source data. A feature element-wise weighting (FEW) method that can reduce the distribution discrepancy between different categories is also proposed for MSDA. Experimental results and quality analysis show the significance of solving the MSDA problem and the effectiveness of the proposed method.
Submitted 4 November, 2020;
originally announced November 2020.
-
Against Adversarial Learning: Naturally Distinguish Known and Unknown in Open Set Domain Adaptation
Authors:
Sitong Mao,
Xiao Shen,
Fu-lai Chung
Abstract:
Open set domain adaptation refers to the scenario in which the target domain contains categories that do not exist in the source domain. It is more common in reality than the typical closed set domain adaptation, where the source domain and the target domain contain the same categories. The main difficulty of open set domain adaptation is that we need to distinguish which target data belong to the unknown classes when machine learning models only have concepts about what they know. In this paper, we propose an "against adversarial learning" method that can distinguish unknown target data from known data naturally without setting any additional hyperparameters, while the target data predicted to belong to the known classes are classified at the same time. Experimental results show that the proposed method achieves a significant improvement in performance compared with several state-of-the-art methods.
Submitted 4 November, 2020;
originally announced November 2020.
-
Deep Adversarial Domain Adaptation Based on Multi-layer Joint Kernelized Distance
Authors:
Sitong Mao,
Jiaxin Chen,
Xiao Shen,
Fu-lai Chung
Abstract:
Domain adaptation refers to the learning scenario in which a model learned from the source data is applied to target data that have the same categories but a different distribution. While it has been widely applied, the distribution discrepancy between source data and target data can substantially affect the adaptation performance. The problem has recently been addressed by employing adversarial learning, and distinctive adaptation performance has been reported. In this paper, a deep adversarial domain adaptation model based on a multi-layer joint kernelized distance metric is proposed. By utilizing the abstract features extracted from deep networks, the multi-layer joint kernelized distance (MJKD) between the $j$th target data predicted as the $m$th category and all the source data of the $m'$th category is computed. Based on MJKD, a class-balanced selection strategy is utilized in each category to select target data that are most likely to be classified correctly and treat them as labeled data using their pseudo labels. Then an adversarial architecture is used to draw the newly generated labeled training data and the remaining target data close to each other. In this way, the target data themselves provide valuable information to enhance the domain adaptation. An analysis of the proposed method is also given, and the experimental results demonstrate that the proposed method can achieve better performance than a number of state-of-the-art methods.
Submitted 8 October, 2020;
originally announced October 2020.
-
Ultrasound Modulated Bioluminescence Tomography With A Single Optical Measurement
Authors:
Francis Chung,
Tianyu Yang,
Yang Yang
Abstract:
Ultrasound modulated bioluminescence tomography (UMBLT) is an imaging method which can be formulated as a hybrid inverse source problem. In the regime where light propagation is modeled by a radiative transfer equation, previous approaches to this problem require large numbers of optical measurements [10]. Here we propose an alternative solution for this inverse problem which requires only a single optical measurement in order to reconstruct the isotropic source. Specifically, we derive two inversion formulae based on Neumann series and Fredholm theory respectively, and prove their convergence under sufficient conditions. The resulting numerical algorithms are implemented and experimented to reconstruct both continuous and discontinuous sources in the presence of noise.
Submitted 12 August, 2020;
originally announced August 2020.
-
Adversarial Deep Network Embedding for Cross-network Node Classification
Authors:
Xiao Shen,
Quanyu Dai,
Fu-lai Chung,
Wei Lu,
Kup-Sze Choi
Abstract:
In this paper, the task of cross-network node classification, which leverages the abundant labeled nodes from a source network to help classify unlabeled nodes in a target network, is studied. The existing domain adaptation algorithms generally fail to model the network structural information, and the current network embedding models mainly focus on single-network applications. Thus, both of them cannot be directly applied to solve the cross-network node classification problem. This motivates us to propose an adversarial cross-network deep network embedding (ACDNE) model to integrate adversarial domain adaptation with deep network embedding so as to learn network-invariant node representations that can also well preserve the network structural information. In ACDNE, the deep network embedding module utilizes two feature extractors to jointly preserve attributed affinity and topological proximities between nodes. In addition, a node classifier is incorporated to make node representations label-discriminative. Moreover, an adversarial domain adaptation technique is employed to make node representations network-invariant. Extensive experimental results demonstrate that the proposed ACDNE model achieves the state-of-the-art performance in cross-network node classification.
Submitted 17 February, 2020;
originally announced February 2020.
-
Inverse Radiative Transport with Local Data
Authors:
Francis J. Chung
Abstract:
We consider an inverse problem for a radiative transport equation (RTE) in which boundary sources and measurements are restricted to a single subset $E$ of the boundary of the domain $Ω$. We show that this problem can be solved globally if the restriction of the X-ray transform to lines through $E$ is invertible on $Ω$. In particular, if $Ω$ is strictly convex, we show that this local data problem can be solved globally whenever $E$ is an open subset of the boundary. The proof relies on isolation and analysis of the second term in the collision expansion for solutions to the RTE, essentially considering light which scatters exactly once inside the domain.
Submitted 30 January, 2020;
originally announced January 2020.
-
On diffusive scaling in acousto-optic imaging
Authors:
Francis J. Chung,
Ru-Yu Lai,
Qin Li
Abstract:
Acousto-optic imaging (AOI) is a hybrid imaging process. By perturbing the to-be-reconstructed tissues with acoustic waves, one introduces the interaction between the acoustic and optical waves, leading to a more stable reconstruction of the optical properties. The mathematical model was described in [25], with the radiative transfer equation serving as the forward model for the optical transport. In this paper we investigate the stability of the reconstruction. In particular, we are interested in how the stability depends on the Knudsen number, Kn, a quantity that measures the intensity of the scattering effect of photon particles in a medium. Our analysis shows that as Kn decreases to zero, photons scatter more frequently, and since information is lost, the reconstruction becomes harder. To counter this effect, devices need to be constructed so that the laser beam is highly concentrated. We will give a quantitative error bound, and explicitly show that such concentration has an exponential dependence on Kn. Numerical evidence will be provided to verify the proof.
Submitted 5 May, 2020; v1 submitted 21 January, 2020;
originally announced January 2020.
-
Variational Metric Scaling for Metric-Based Meta-Learning
Authors:
Jiaxin Chen,
Li-Ming Zhan,
Xiao-Ming Wu,
Fu-lai Chung
Abstract:
Metric-based meta-learning has attracted a lot of attention due to its effectiveness and efficiency in few-shot learning. Recent studies show that metric scaling plays a crucial role in the performance of metric-based meta-learning algorithms. However, a principled method for learning the metric scaling parameter automatically is still lacking. In this paper, we recast metric-based meta-learning from a Bayesian perspective and develop a variational metric scaling framework for learning a proper metric scaling parameter. First, we propose a stochastic variational method to learn a single global scaling parameter. To better fit the embedding space to a given data distribution, we extend our method to learn a dimensional scaling vector to transform the embedding space. Furthermore, to learn task-specific embeddings, we generate task-dependent dimensional scaling vectors with amortized variational inference. Our method is end-to-end without any pre-training and can be used as a simple plug-and-play module for existing metric-based meta-algorithms. Experiments on mini-ImageNet show that our methods can be used to consistently improve the performance of existing metric-based meta-algorithms including prototypical networks and TADAM. The source code can be downloaded from https://github.com/jiaxinchen666/variational-scaling.
Submitted 26 August, 2020; v1 submitted 26 December, 2019;
originally announced December 2019.
-
Heterogeneous tissue characterization using ultrasound: a comparison of fractal analysis backscatter models on liver tumors
Authors:
Omar S. Al-Kadi,
Daniel Y. F. Chung,
Constantin C. Coussios,
J. Alison Noble
Abstract:
Assessing tumor tissue heterogeneity via ultrasound has recently been suggested for predicting early response to treatment. The ultrasound backscattering characteristics can assist in better understanding the tumor texture by highlighting local concentration and spatial arrangement of tissue scatterers. However, it is challenging to quantify the various tissue heterogeneities, ranging from fine to coarse, of the echo envelope peaks in tumor texture. Local parametric fractal features extracted via maximum likelihood estimation from five well-known statistical model families are evaluated for the purpose of ultrasound tissue characterization. The fractal dimension (self-similarity measure) was used to characterize the spatial distribution of scatterers, while the lacunarity (sparsity measure) was applied to determine scatterer number density. Performance was assessed based on 608 cross-sectional clinical ultrasound RF images of liver tumors (230 and 378 demonstrating respondent and non-respondent cases, respectively). Cross-validation via leave-one-tumor-out and different k-fold methodologies using a Bayesian classifier was employed for validation. The fractal properties of the backscattered echoes based on the Nakagami model (Nkg) and its extended four-parameter Nakagami-generalized inverse Gaussian (NIG) distribution achieved the best results - with nearly similar performance - for characterizing liver tumor tissue. Accuracy, sensitivity and specificity for the Nkg/NIG were: 85.6%/86.3%, 94.0%/96.0%, and 73.0%/71.0%, respectively. Other statistical models, such as the Rician, Rayleigh, and K-distribution, were found to be less effective in characterizing the subtle changes in tissue texture as an indication of response to treatment. Employing the most relevant and practical statistical model could have potential consequences for the design of an early and effective clinical therapy.
Submitted 20 December, 2019;
originally announced December 2019.
-
A Note on the Transport Method for Hybrid Inverse Problems
Authors:
Francis J. Chung,
Jeremy G. Hoskins,
John C. Schotland
Abstract:
There are several hybrid inverse problems for equations of the form $\nabla \cdot D \nabla u - σu = 0$ in which we want to obtain the coefficients $D$ and $σ$ on a domain $Ω$ when the solutions $u$ are known. One approach is to use two solutions $u_1$ and $u_2$ to obtain a transport equation for the coefficient $D$, and then solve this equation inward from the boundary along the integral curves of a vector field $X$ defined by $u_1$ and $u_2$. It follows from an argument of Guillaume Bal and Kui Ren that for any nontrivial choices of $u_1$ and $u_2$, this method suffices to recover the coefficients on a dense set in $Ω$. This short note presents an alternate proof of the same result from a dynamical systems point of view.
Submitted 9 December, 2019; v1 submitted 10 October, 2019;
originally announced October 2019.
-
A Transport Model for Multi-Frequency Acousto-Optic Tomography
Authors:
Francis J. Chung,
Jeremy G. Hoskins,
John C. Schotland
Abstract:
In a medium where the dielectric permittivity is perturbed in the presence of an acoustic wave, optical scattering generates frequency-shifted light. In this paper we consider the inverse problem of recovering the optical properties of this medium from measurements of the frequency-shifted light, using a radiative transport equation (RTE) model for light propagation. Given some assumptions on the regularity and isotropicity of the coefficients of the RTE, we show that the absorption coefficient can be reconstructed from the boundary measurements of a single well chosen illumination, and that the scattering coefficients can be reconstructed from boundary measurements of a one-parameter family of illuminations.
Submitted 10 October, 2019;
originally announced October 2019.
-
Strain-stress study of AlxGa1-xN/AlN heterostructures on c-plane sapphire and related optical properties
Authors:
Y. Feng,
V. Saravade,
T. F. Chung,
Y. Dong,
H. Zhou,
B. Kucukgok,
I. T. Ferguson,
N. Lu
Abstract:
This work presents a systematic study of stress and strain of AlxGa1-xN/AlN with composition ranging from GaN to AlN, grown on a c-plane sapphire by metal-organic chemical vapor deposition, using synchrotron radiation high-resolution X-ray diffraction and reciprocal space mapping. The c-plane of the AlxGa1-xN epitaxial layers exhibits compressive strain, while the a-plane exhibits tensile strain. The biaxial stress and strain are found to increase with increasing Al composition, although the lattice mismatch between the AlxGa1-xN and the buffer layer AlN gets smaller. A reduction in the lateral coherence lengths and an increase in the edge and screw dislocations are seen as the AlxGa1-xN composition is varied from GaN to AlN, exhibiting a clear dependence of the crystal properties of AlxGa1-xN on the Al content. The bandgap of the epitaxial layers is slightly lower than the predicted value due to a larger tensile strain effect on the a-axis compared to the compressive strain on the c-axis. Raman characteristics of the AlxGa1-xN samples exhibit a shift in the phonon peaks with the Al composition. The effect of strain on the optical phonon energies of the epitaxial layers is also discussed. The techniques discussed here can be used to study other similar materials.
Submitted 10 May, 2019;
originally announced May 2019.
-
Slow Fibonacci Walks
Authors:
Fan Chung,
Ron Graham,
Sam Spiro
Abstract:
For a positive integer $n$, we study the number of steps to reach $n$ by a {\it Fibonacci walk} for some starting pair $a_1$ and $a_2$ satisfying the recurrence of $a_{k+2}=a_{k+1}+a_k$. The problem of slow Fibonacci walks, first suggested by Richard Stanley, is to determine the maximum number $s(n)$ of steps for such a Fibonacci walk ending at $n$. Stanley conjectured that for most $n$, there is a slow Fibonacci walk reaching $n = a_s$ with the property that $a_{s+1}$ is the integer closest to $φn$ where $φ=(1+\sqrt{5})/2$. We prove that this is true for only a positive fraction of $n$. We give explicit formulas for the choice of the starting pairs and the determination of $s(n)$ by giving a characterization theorem. We also derive a number of density results concerning the distribution of down and up cases (that is, those $n$ with $a_{s+1}=\lfloor φn\rfloor$ or $\lceil φn \rceil$, respectively), as well as for more general `paradoxical' cases.
Submitted 19 March, 2019;
originally announced March 2019.
-
Network Together: Node Classification via Cross network Deep Network Embedding
Authors:
Xiao Shen,
Quanyu Dai,
Sitong Mao,
Fu-lai Chung,
Kup-Sze Choi
Abstract:
Network embedding is a highly effective method to learn low-dimensional node vector representations with original network structures being well preserved. However, existing network embedding algorithms are mostly developed for a single network and fail to learn generalized feature representations across different networks. In this paper, we study a cross-network node classification problem, which aims at leveraging the abundant labeled information from a source network to help classify the unlabeled nodes in a target network. To succeed in such a task, transferable features should be learned for nodes across different networks. To this end, a novel cross-network deep network embedding (CDNE) model is proposed to incorporate domain adaptation into deep network embedding so as to learn label-discriminative and network-invariant node vector representations. On the one hand, CDNE leverages network structures to capture the proximities between nodes within a network, by mapping more strongly connected nodes to have more similar latent vector representations. On the other hand, node attributes and labels are leveraged to capture the proximities between nodes across different networks by making the same labeled nodes across networks have aligned latent vector representations. Extensive experiments have been conducted, demonstrating that the proposed CDNE model significantly outperforms the state-of-the-art network embedding algorithms in cross-network node classification.
Submitted 5 June, 2020; v1 submitted 22 January, 2019;
originally announced January 2019.
-
Deep Network Embedding for Graph Representation Learning in Signed Networks
Authors:
Xiao Shen,
Fu-Lai Chung
Abstract:
Network embedding has attracted increasing attention over the past few years. As an effective approach to solve graph mining problems, network embedding aims to learn a low-dimensional feature vector representation for each node of a given network. The vast majority of existing network embedding algorithms, however, are only designed for unsigned networks, whereas signed networks, which contain both positive and negative links, have quite distinct properties from their unsigned counterparts. In this paper, we propose a deep network embedding model to learn the low-dimensional node vector representations with structural balance preservation for the signed networks. The model employs a semi-supervised stacked auto-encoder to reconstruct the adjacency connections of a given signed network. As the adjacency connections are overwhelmingly positive in the real-world signed networks, we impose a larger penalty to make the auto-encoder focus more on reconstructing the scarce negative links than the abundant positive links. In addition, to preserve the structural balance property of signed networks, we design the pairwise constraints to make the positively connected nodes much closer than the negatively connected nodes in the embedding space. Based on the network representations learned by the proposed model, we conduct link sign prediction and community detection in signed networks. Extensive experimental results on real-world datasets demonstrate the superiority of the proposed model over the state-of-the-art network embedding algorithms for graph representation learning in signed networks.
Submitted 7 January, 2019;
originally announced January 2019.
-
The maximum relaxation time of a random walk
Authors:
Sinan G. Aksoy,
Fan Chung,
Michael Tait,
Josh Tobin
Abstract:
We show the minimum spectral gap of the normalized Laplacian over all simple, connected graphs on $n$ vertices is $(1+o(1))\tfrac{54}{n^3}$. This minimum is achieved asymptotically by a double kite graph. Consequently, this leads to sharp upper bounds for the maximum relaxation time of a random walk, settling a conjecture of Aldous and Fill. We also improve an eigenvalue-diameter inequality by giving a new lower bound for the spectral gap of the normalized Laplacian. This eigenvalue lower bound is asymptotically best possible.
Submitted 10 July, 2018; v1 submitted 16 April, 2018;
originally announced April 2018.
-
The $L^p$ Carleman estimate and a partial data inverse problem
Authors:
Francis J. Chung,
Leo Tzou
Abstract:
We construct an explicit Green's function for the conjugated Laplacian $e^{-ω\cdot x/h}Δe^{-ω\cdot x/h}$, which lets us control our solutions on roughly half of the boundary. We apply the Green's function to solve a partial data inverse problem for the Schrödinger equation with potential $q \in L^{n/2}$. We also use this Green's function to derive $L^p$ Carleman estimates similar to the ones in Kenig-Ruiz-Sogge, but for functions with support up to part of the boundary.
Submitted 5 October, 2016;
originally announced October 2016.
-
Inverse Transport and Acousto-Optic Imaging
Authors:
Francis J Chung,
John C Schotland
Abstract:
We consider the inverse problem of recovering the optical properties of a highly-scattering medium from acousto-optic measurements. Using such measurements, we show that the scattering and absorption coefficients of the radiative transport equation can be reconstructed with Lipschitz stability by means of algebraic inversion formulas.
Submitted 26 September, 2016;
originally announced September 2016.
-
Optical tomography on graphs
Authors:
Francis J. Chung,
Anna C. Gilbert,
Jeremy G. Hoskins,
John C. Schotland
Abstract:
We present an algorithm for solving inverse problems on graphs analogous to those arising in diffuse optical tomography for continuous media. In particular, we formulate and analyze a discrete version of the inverse Born series, proving estimates characterizing the domain of convergence, approximation errors, and stability of our approach. We also present a modification which allows additional information on the structure of the potential to be incorporated, facilitating recovery for a broader class of problems.
Submitted 10 September, 2016;
originally announced September 2016.
-
Extreme values of the stationary distribution of random walks on directed graphs
Authors:
Sinan Aksoy,
Fan Chung,
Xing Peng
Abstract:
We examine the stationary distribution of random walks on directed graphs. In particular, we focus on the {\em principal ratio}, which is the ratio of maximum to minimum values of vertices in the stationary distribution. We give an upper bound for this ratio over all strongly connected graphs on $n$ vertices. We characterize all graphs achieving the upper bound and we give explicit constructions for these extremal graphs. Additionally, we show that under certain conditions, the principal ratio is tightly bounded. We also provide counterexamples to show the principal ratio cannot be tightly bounded under weaker conditions.
Submitted 2 February, 2016;
originally announced February 2016.
-
Quantification of Ultrasonic Texture heterogeneity via Volumetric Stochastic Modeling for Tissue Characterization
Authors:
O. S. Al-Kadi,
Daniel Y. F. Chung,
Robert C. Carlisle,
Constantin C. Coussios,
J. Alison Noble
Abstract:
Intensity variations in image texture can provide powerful quantitative information about physical properties of biological tissue. However, tissue patterns can vary according to the imaging system used and are intrinsically correlated to the scale of analysis. In the case of ultrasound, the Nakagami distribution is a general model of the ultrasonic backscattering envelope under various scattering conditions and densities, and it can be employed for characterizing image texture. However, the subtle intra-heterogeneities within a given mass are difficult to capture via this model, as it works at a single spatial scale. This paper proposes a locally adaptive 3D multi-resolution Nakagami-based fractal feature descriptor that extends Nakagami-based texture analysis to accommodate subtle speckle spatial frequency tissue intensity variability in volumetric scans. Local textural fractal descriptors - which are invariant to affine intensity changes - are extracted from volumetric patches at different spatial resolutions from voxel lattice-based generated shape and scale Nakagami parameters. Using ultrasound radio-frequency datasets we found that after applying an adaptive fractal decomposition label transfer approach on top of the generated Nakagami voxels, tissue characterization results were superior to the state of the art. Experimental results on real 3D ultrasonic pre-clinical and clinical datasets suggest that describing tumor intra-heterogeneity via this descriptor may facilitate improved prediction of therapy response and disease characterization.
Submitted 14 January, 2016;
originally announced January 2016.
-
Theoretic Analysis and Extremely Easy Algorithms for Domain Adaptive Feature Learning
Authors:
Wenhao Jiang,
Cheng Deng,
Wei Liu,
Feiping Nie,
Fu-lai Chung,
Heng Huang
Abstract:
Domain adaptation problems arise in a variety of applications, where a training dataset from the \textit{source} domain and a test dataset from the \textit{target} domain typically follow different distributions. The primary difficulty in designing effective learning models to solve such problems lies in how to bridge the gap between the source and target distributions. In this paper, we provide a comprehensive analysis of feature learning algorithms used in conjunction with linear classifiers for domain adaptation. Our analysis shows that in order to achieve good adaptation performance, the second moments of the source domain distribution and target domain distribution should be similar. Based on our new analysis, a novel extremely easy feature learning algorithm for domain adaptation is proposed. Furthermore, our algorithm is extended by leveraging multiple layers, leading to a deep linear model. We evaluate the effectiveness of the proposed algorithms on domain adaptation tasks using the Amazon review dataset and the spam dataset from the ECML/PKDD 2006 discovery challenge.
Submitted 10 August, 2017; v1 submitted 5 September, 2015;
originally announced September 2015.
-
Finding Consensus in Multi-Agent Networks Using Heat Kernel Pagerank
Authors:
Fan Chung,
Olivia Simpson
Abstract:
We present a new and efficient algorithm for determining a consensus value for a network of agents. Unlike existing algorithms, our algorithm evaluates the consensus value for very large networks using heat kernel pagerank. We consider two frameworks for the consensus problem: a weighted average consensus among all agents, and consensus in a leader-following formation. Using a heat kernel pagerank approximation, we give consensus algorithms that run in time sublinear in the size of the network, and provide quantitative analysis of the tradeoff between performance guarantees and error estimates.
Submitted 31 July, 2015;
originally announced July 2015.
-
Distributed Algorithms for Finding Local Clusters Using Heat Kernel Pagerank
Authors:
Fan Chung,
Olivia Simpson
Abstract:
A distributed algorithm performs local computations on pieces of input and communicates the results through given communication links. When processing a massive graph in a distributed algorithm, local outputs must be configured as a solution to a graph problem without shared memory and with few rounds of communication. In this paper we consider the problem of computing a local cluster in a massive graph in the distributed setting. Computing local clusters is of interest in specific applications, such as detecting communities in social networks or groups of interacting proteins in biological networks. When the graph models the computer network itself, detecting local clusters can help to prevent communication bottlenecks. We give a distributed algorithm that computes a local cluster in time that depends only logarithmically on the size of the graph in the CONGEST model. In particular, when the value of the optimal local cluster is known, the algorithm runs in time entirely independent of the size of the graph and depends only on error bounds for approximation. We also show that the local cluster problem can be solved in the k-machine distributed model in sublinear time. The speedup of our local cluster algorithms is mainly due to the use of our distributed algorithm for heat kernel pagerank.
Submitted 2 December, 2016; v1 submitted 31 July, 2015;
originally announced July 2015.
-
Ultrasound modulated bioluminescence tomography and controllability of the radiative transport equation
Authors:
Guillaume Bal,
Francis J. Chung,
John C. Schotland
Abstract:
We propose a method to reconstruct the density of an optical source in a highly scattering medium from ultrasound-modulated optical measurements. Our approach is based on the solution to a hybrid inverse source problem for the radiative transport equation (RTE). A controllability result for the RTE plays an essential role in the analysis.
Submitted 21 June, 2015;
originally announced June 2015.
-
Juggling card sequences
Authors:
Steve Butler,
Fan Chung,
Jay Cummings,
Ron Graham
Abstract:
Juggling patterns can be described by a sequence of cards which keep track of the relative order of the balls at each step. This interpretation has many algebraic and combinatorial properties, with connections to Stirling numbers, Dyck paths, Narayana numbers, boson normal ordering, arc-labeled digraphs, and more. Some of these connections are investigated with a particular focus on enumerating juggling patterns satisfying certain ordering constraints, including where the number of crossings is fixed.
Submitted 6 April, 2015;
originally announced April 2015.
-
A General Framework for Multi-focal Image Classification and Authentication: Application to Microscope Pollen Images
Authors:
François Chung,
Tomás Rodríguez
Abstract:
In this article, we propose a general framework for multi-focal image classification and authentication, the methodology being demonstrated on microscope pollen images. The framework is meant to be generic and based on a brute force-like approach aimed to be efficient not only on any kind, and any number, of pollen images (regardless of the pollen type), but also on any kind of multi-focal images. All stages of the framework's pipeline are designed to be used in an automatic fashion. First, the optimal focus is selected using the absolute gradient method. Then, pollen grains are extracted using a coarse-to-fine approach involving both clustering and morphological techniques (coarse stage), and a snake-based segmentation (fine stage). Finally, features are extracted and selected using a generalized approach, and their classification is tested with four classifiers: Weighted Neighbor Distance, Neural Network, Decision Tree and Random Forest. The latter method, which has shown the best and most robust classification accuracy results (above 97% for any number of pollen types), is finally used for the authentication stage.
Submitted 19 March, 2015;
originally announced March 2015.
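The focus-selection step named in the abstract above can be illustrated with a minimal sketch. It assumes the common form of the absolute gradient focus measure (the sum of absolute differences between horizontally adjacent pixels, optionally ignoring differences below a threshold); the abstract does not give the authors' exact formulation, and the function names `absolute_gradient` and `select_sharpest` are hypothetical.

```python
def absolute_gradient(image, threshold=0):
    """Sum of |I(x+1, y) - I(x, y)| over a grayscale image (list of rows).

    Differences below `threshold` are ignored. This is an assumed, common
    form of the measure, not necessarily the one used in the paper.
    """
    score = 0
    for row in image:
        for left, right in zip(row, row[1:]):
            diff = abs(right - left)
            if diff >= threshold:
                score += diff
    return score


def select_sharpest(stack, threshold=0):
    """Return the index of the focal plane with the highest focus score."""
    return max(range(len(stack)),
               key=lambda i: absolute_gradient(stack[i], threshold))


# A flat (defocused) plane versus a high-contrast (in-focus) plane.
blurry = [[10, 10, 10], [10, 10, 10]]
sharp = [[0, 255, 0], [255, 0, 255]]
best = select_sharpest([blurry, sharp])
```

Here `best` indexes the plane with the largest summed gradient, which is the usual proxy for the in-focus image in a multi-focal stack.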
-
Automatic Pollen Grain and Exine Segmentation from Microscope Images
Authors:
François Chung,
Tomás Rodríguez
Abstract:
In this article, we propose an automatic method for the segmentation of pollen grains from microscope images, followed by the automatic segmentation of their exine. The objective of exine segmentation is to separate the pollen grain into two regions of interest: exine and inner part. A coarse-to-fine approach ensures a smooth and accurate segmentation of both structures. As a rough stage, grain segmentation is performed by a procedure involving clustering and morphological operations, while the exine is approximated by an iterative procedure consisting of consecutive cropping steps of the pollen grain. A snake-based segmentation is performed to refine the segmentation of both structures. Results have shown that our segmentation method is able to deal with different pollen types, as well as with different types of exine and inner part appearance. The proposed segmentation method aims to be generic and has been designed as one of the core steps of an automatic pollen classification framework.
Submitted 19 March, 2015;
originally announced March 2015.
-
Solving Local Linear Systems with Boundary Conditions Using Heat Kernel Pagerank
Authors:
Fan Chung,
Olivia Simpson
Abstract:
We present an efficient algorithm for solving local linear systems with a boundary condition using the Green's function of a connected induced subgraph related to the system. We introduce the method of using the Dirichlet heat kernel pagerank vector to approximate local solutions to linear systems in the graph Laplacian satisfying given boundary conditions over a particular subset of vertices. With an efficient algorithm for approximating Dirichlet heat kernel pagerank, our local linear solver algorithm computes an approximate local solution with multiplicative and additive error $ε$ by performing $O(ε^{-5}s^3\log(s^3ε^{-1})\log n)$ random walk steps, where $n$ is the number of vertices in the full graph and $s$ is the size of the local system on the induced subgraph.
Submitted 10 March, 2015;
originally announced March 2015.
-
Computing Heat Kernel Pagerank and a Local Clustering Algorithm
Authors:
Fan Chung,
Olivia Simpson
Abstract:
Heat kernel pagerank is a variation of Personalized PageRank given in an exponential formulation. In this work, we present a sublinear time algorithm for approximating the heat kernel pagerank of a graph. The algorithm works by simulating random walks of bounded length and runs in time $O\big(\frac{\log(ε^{-1})\log n}{ε^3\log\log(ε^{-1})}\big)$, assuming performing a random walk step and sampling from a distribution with bounded support take constant time.
The quantitative ranking of vertices obtained with heat kernel pagerank can be used for local clustering algorithms. We present an efficient local clustering algorithm that finds cuts by performing a sweep over a heat kernel pagerank vector, using the heat kernel pagerank approximation algorithm as a subroutine. Specifically, we show that for a subset $S$ of Cheeger ratio $φ$, many vertices in $S$ may serve as seeds for a heat kernel pagerank vector which will find a cut of conductance $O(\sqrt{φ})$.
Submitted 15 December, 2016; v1 submitted 10 March, 2015;
originally announced March 2015.
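The random-walk idea in the abstract above can be sketched as follows. Under the usual definition, the heat kernel pagerank with temperature $t$ and seed distribution $f$ is $e^{-t}\sum_{k\ge 0}\frac{t^k}{k!} f P^k$, where $P$ is the random walk transition matrix, so the endpoint of a walk whose length is Poisson-distributed with mean $t$ is a sample from it. This is a plain Monte Carlo illustration, not the paper's algorithm: the function names, the `max_len` cap, and the number of walks are assumptions, and no error guarantee is claimed.

```python
import math
import random
from collections import Counter


def sample_poisson(t, rng):
    """Draw k ~ Poisson(t) with Knuth's method (suitable for small t)."""
    limit = math.exp(-t)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def approx_heat_kernel_pagerank(adj, seed, t, num_walks=10000,
                                max_len=None, rng=None):
    """Estimate the heat kernel pagerank of a single seed vertex.

    adj maps each vertex to a non-empty list of neighbours (assumption:
    no isolated vertices). Each walk starts at `seed`, takes a
    Poisson(t) number of uniform random steps (optionally capped at
    `max_len`, a hypothetical truncation parameter), and its endpoint
    is counted.
    """
    rng = rng or random.Random(0)
    counts = Counter()
    for _ in range(num_walks):
        steps = sample_poisson(t, rng)
        if max_len is not None:
            steps = min(steps, max_len)
        v = seed
        for _ in range(steps):
            v = rng.choice(adj[v])
        counts[v] += 1
    return {v: c / num_walks for v, c in counts.items()}


# A 4-cycle; the estimate is an empirical distribution over vertices.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
estimate = approx_heat_kernel_pagerank(cycle, seed=0, t=2.0, num_walks=2000)
```

Sorting vertices by the estimated value divided by degree and sweeping over prefixes of that order is the standard way such a vector is turned into a local cut.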
-
Partial Data Inverse Problems for Maxwell Equations via Carleman Estimates
Authors:
Francis J. Chung,
Petri Ola,
Mikko Salo,
Leo Tzou
Abstract:
In this article we consider an inverse boundary value problem for the time-harmonic Maxwell equations. We show that the electromagnetic material parameters are determined by boundary measurements where part of the boundary data is measured on a possibly very small set. This is an extension of earlier scalar results of Bukhgeim-Uhlmann and Kenig-Sjöstrand-Uhlmann to the Maxwell system. The main contribution is to show that the Carleman estimate approach to scalar partial data inverse problems introduced in those works can be carried over to the Maxwell system.
Submitted 5 February, 2015;
originally announced February 2015.
-
Transfer Prototype-based Fuzzy Clustering
Authors:
Zhaohong Deng,
Yizhang Jiang,
Fu-Lai Chung,
Hisao Ishibuchi,
Kup-Sze Choi,
Shitong Wang
Abstract:
The traditional prototype based clustering methods, such as the well-known fuzzy c-means (FCM) algorithm, usually need sufficient data to find a good clustering partition. If the available data is limited or scarce, most of the existing prototype based clustering algorithms will no longer be effective. While the data for the current clustering task may be scarce, there is usually some useful knowledge available in the related scenes/domains. In this study, the concept of transfer learning is applied to prototype based fuzzy clustering (PFC). Specifically, the idea of leveraging knowledge from the source domain is exploited to develop a set of transfer prototype based fuzzy clustering (TPFC) algorithms. Three prototype based fuzzy clustering algorithms, namely, FCM, fuzzy k-plane clustering (FKPC) and fuzzy subspace clustering (FSC), have been combined with a knowledge-leveraging mechanism to develop the corresponding transfer clustering algorithms. Novel objective functions are proposed to integrate the knowledge of the source domain with the data of the target domain for clustering in the target domain. The proposed algorithms have been validated on different synthetic and real-world datasets, and the results demonstrate their effectiveness when compared with both the original prototype based fuzzy clustering algorithms and related clustering algorithms such as multi-task clustering and co-clustering.
Submitted 5 April, 2016; v1 submitted 19 September, 2014;
originally announced September 2014.
-
Partial Data for the Neumann-Dirichlet Magnetic Schrödinger Inverse Problem
Authors:
Francis J. Chung
Abstract:
We show that an electric potential and magnetic field can be uniquely determined by partial boundary measurements of the Neumann-to-Dirichlet map of the associated magnetic Schrödinger operator. This improves upon previous results of the author by including the determination of a magnetic field. The main technical advance is an improvement on the Carleman estimate for the magnetic Schrödinger operator with the appropriate boundary conditions. This allows the construction of complex geometrical optics solutions with greater regularity, which are needed to deal with the first order term in the operator. This improved regularity of CGO solutions may have applications in the study of inverse problems in systems of equations with partial boundary data.
Submitted 18 February, 2014;
originally announced February 2014.
-
Decomposition of random graphs into complete bipartite graphs
Authors:
Fan Chung,
Xing Peng
Abstract:
We consider the problem of partitioning the edge set of a graph $G$ into the minimum number $τ(G)$ of edge-disjoint complete bipartite subgraphs. We show that for a random graph $G$ in $G(n,p)$, where $p$ is a constant no greater than $1/2$, almost surely $τ(G)$ is between $n- c(\ln_{1/p} n)^{3+ε}$ and $n - 2\ln_{1/(1-p)} n$ for any positive constants $c$ and $ε$.
Submitted 26 November, 2015; v1 submitted 4 February, 2014;
originally announced February 2014.
-
Partial data inverse problems for the Hodge Laplacian
Authors:
Francis J. Chung,
Mikko Salo,
Leo Tzou
Abstract:
We prove uniqueness results for a Calderón type inverse problem for the Hodge Laplacian acting on graded forms on certain manifolds in three dimensions. In particular, we show that partial measurements of the relative-to-absolute or absolute-to-relative boundary value maps uniquely determine a zeroth order potential. The method is based on Carleman estimates for the Hodge Laplacian with relative or absolute boundary conditions, and on the construction of complex geometric optics solutions which reduce the Calderón type problem to a tensor tomography problem for 2-tensors. The arguments in this paper allow one to establish partial data results for elliptic systems that generalize the scalar results due to Kenig-Sjöstrand-Uhlmann.
Submitted 12 May, 2016; v1 submitted 17 October, 2013;
originally announced October 2013.
-
Synthetic Graphene Grown by Chemical Vapor Deposition on Copper Foils
Authors:
Ting Fung Chung,
Tian Shen,
Helin Cao,
Luis A. Jauregui,
Wei Wu,
Qingkai Yu,
David Newell,
Yong P. Chen
Abstract:
The discovery of graphene, a single layer of covalently bonded carbon atoms, has attracted intense interest. Initial studies using mechanically exfoliated graphene unveiled its remarkable electronic, mechanical and thermal properties. There has been a growing need for, and rapid development in, large-area deposition of graphene film and its applications. Chemical vapour deposition on copper has emerged as one of the most promising methods for obtaining large-scale graphene films with quality comparable to exfoliated graphene. In this chapter, we review the synthesis and characterization of graphene grown on copper foil substrates by atmospheric pressure chemical vapour deposition. We also discuss potential applications of such large-scale synthetic graphene.
Submitted 22 July, 2013;
originally announced July 2013.
-
Partial Data for the Neumann-to-Dirichlet Map
Authors:
Francis J. Chung
Abstract:
We show that measurements of the Neumann-to-Dirichlet map, roughly speaking, on a certain part of the boundary of a smooth domain in dimension 3 or higher, for inputs with support restricted to the other part, determine an electric potential on that domain. Given a convexity condition on the domain, either the set on which measurements are taken, or the set on which input functions are supported, can be made to be arbitrarily small. The result is analogous to the result by Kenig, Sjöstrand, and Uhlmann for the Dirichlet-to-Neumann map. The main new ingredient in the proof is a Carleman estimate for the Schrödinger operator with appropriate boundary conditions.
Submitted 20 October, 2013; v1 submitted 1 November, 2012;
originally announced November 2012.
-
Harnack inequalities for graphs with non-negative Ricci curvature
Authors:
Fan Chung,
Yong Lin,
Shing-Tung Yau
Abstract:
We establish a Harnack inequality for finite connected graphs with non-negative Ricci curvature. As a consequence, we derive an eigenvalue lower bound, extending previous results for Ricci flat graphs.
Submitted 27 July, 2012;
originally announced July 2012.