-
Homophily-adjusted social influence estimation
Authors:
Hanh T. D. Pham,
Daniel K. Sewell
Abstract:
Homophily and social influence are two key concepts of social network analysis. Distinguishing between these phenomena is difficult, and approaches to disambiguate the two have been primarily limited to longitudinal data analyses. In this study, we provide sufficient conditions for valid estimation of social influence through cross-sectional data, leading to a novel homophily-adjusted social influence model which addresses the backdoor pathway of latent homophilic features. The oft-used network autocorrelation model (NAM) is the special case of our proposed model with no latent homophily, suggesting that the NAM is only valid when all homophilic attributes are observed. We conducted an extensive simulation study to evaluate the performance of our proposed homophily-adjusted model, comparing its results with those from the conventional NAM. Our findings shed light on the nuanced dynamics of social networks, presenting a valuable tool for researchers seeking to estimate the effects of social influence while accounting for homophily. Code to implement our approach is available at https://github.com/hanhtdpham/hanam.
Submitted 28 May, 2024;
originally announced May 2024.
-
Improving Graph Convolutional Networks with Transformer Layer in social-based items recommendation
Authors:
Thi Linh Hoang,
Tuan Dung Pham,
Viet Cuong Ta
Abstract:
In this work, we propose an approach for improving the GCN for predicting ratings in social networks. Our model extends the standard model with several layers of transformer architecture. The main focus of the paper is on the encoder architecture for node embedding in the network. Using the embedding layer from the graph-based convolution layer, the attention mechanism can rearrange the feature space to obtain a more efficient embedding for the downstream task. The experiments show that our proposed architecture achieves better performance than GCN on the traditional link prediction task.
Submitted 12 January, 2024;
originally announced January 2024.
-
Rapid Design of Top-Performing Metal-Organic Frameworks with Qualitative Representations of Building Blocks
Authors:
Yigitcan Comlek,
Thang Duc Pham,
Randall Snurr,
Wei Chen
Abstract:
Data-driven materials design often encounters challenges where systems require or possess qualitative (categorical) information. Metal-organic frameworks (MOFs) are an example of such material systems. The representation of MOFs through different building blocks makes it a challenge for designers to incorporate qualitative information into design optimization. Furthermore, the large number of potential building blocks leads to a combinatorial challenge, with millions of possible MOFs that could be explored through time-consuming physics-based approaches. In this work, we integrated Latent Variable Gaussian Process (LVGP) and Multi-Objective Batch-Bayesian Optimization (MOBBO) to identify top-performing MOFs adaptively, autonomously, and efficiently without any human intervention. Our approach provides three main advantages: (i) no specific physical descriptors are required and only building blocks that construct the MOFs are used in global optimization through qualitative representations, (ii) the method is application and property independent, and (iii) the latent variable approach provides an interpretable model of qualitative building blocks with physical justification. To demonstrate the effectiveness of our method, we considered a design space with more than 47,000 MOF candidates. By searching only ~1% of the design space, LVGP-MOBBO was able to identify all MOFs on the Pareto front and more than 97% of the 50 top-performing designs for the CO$_2$ working capacity and CO$_2$/N$_2$ selectivity properties. Finally, we compared our approach with the Random Forest algorithm and demonstrated its efficiency, interpretability, and robustness.
Submitted 17 February, 2023;
originally announced February 2023.
-
A conditioned local limit theorem for non-negative random matrices
Authors:
M. Peigné,
Thi Da Cam Pham
Abstract:
Let $(S_n)_n$ be the random process on $\mathbb R$ driven by the product of i.i.d. non-negative random matrices and $τ$ its exit time from $]0, +\infty[$. By using the adapted strategy initiated by D. Denisov and V. Wachtel, we obtain an asymptotic estimate and bounds for the probability that the process $(S_k)_k$ remains non-negative up to time $n$ and simultaneously belongs to some compact set $[b, b+\ell ]\subset \mathbb R^{*+}$ at time $n$.
Submitted 17 January, 2023;
originally announced January 2023.
-
Multi-fidelity Gaussian Process for Biomanufacturing Process Modeling with Small Data
Authors:
Yuan Sun,
Winton Nathan-Roberts,
Tien Dung Pham,
Ellen Otte,
Uwe Aickelin
Abstract:
In biomanufacturing, developing an accurate model to simulate the complex dynamics of bioprocesses is an important yet challenging task. This is partially due to the uncertainty associated with bioprocesses, high data acquisition cost, and lack of data availability to learn complex relations in bioprocesses. To deal with these challenges, we propose to use a statistical machine learning approach, multi-fidelity Gaussian process, for process modelling in biomanufacturing. Gaussian process regression is a well-established technique based on probability theory which can naturally consider uncertainty in a dataset via Gaussian noise, and multi-fidelity techniques can make use of multiple sources of information with different levels of fidelity, thus suitable for bioprocess modeling with small data. We apply the multi-fidelity Gaussian process to solve two significant problems in biomanufacturing, bioreactor scale-up and knowledge transfer across cell lines, and demonstrate its efficacy on real-world datasets.
Submitted 26 November, 2022;
originally announced November 2022.
-
An FPGA-based Solution for Convolution Operation Acceleration
Authors:
Trung Dinh Pham,
Bao Gia Bach,
Lam Trinh Luu,
Minh Dinh Nguyen,
Hai Duc Pham,
Khoa Bui Anh,
Xuan Quang Nguyen,
Cuong Pham Quoc
Abstract:
Hardware-based acceleration is an extensive attempt to facilitate many computationally-intensive mathematics operations. This paper proposes an FPGA-based architecture to accelerate the convolution operation - a complex and expensive computing step that appears in many Convolutional Neural Network models. We target the design to the standard convolution operation, intending to launch the product as an edge-AI solution. The project's purpose is to produce an FPGA IP core that can process a convolutional layer at a time. System developers can deploy the IP core with various FPGA families by using Verilog HDL as the primary design language for the architecture. The experimental results show that our single computing core synthesized on a simple edge computing FPGA board can offer 0.224 GOPS. When the board is fully utilized, 4.48 GOPS can be achieved.
Submitted 9 June, 2022;
originally announced June 2022.
-
Stegomalware: A Systematic Survey of Malware Hiding and Detection in Images, Machine Learning Models and Research Challenges
Authors:
Rajasekhar Chaganti,
Vinayakumar Ravi,
Mamoun Alazab,
Tuan D. Pham
Abstract:
Malware distribution to the victim network is commonly performed through file attachments in phishing emails or from the internet, when the victim interacts with the source of infection. To detect and prevent malware distribution on the victim machine, existing end-device security applications may leverage techniques such as signature-based, anomaly-based, and machine learning techniques. The well-known file formats Portable Executable (PE) for Windows and Executable and Linkable Format (ELF) for Linux-based operating systems are used for malware analysis, and the malware detection capabilities for these files have been well advanced for real-time detection. But detecting malware payloads hidden in multimedia using steganography has been a challenge for enterprises, as these are rarely seen and usually act as a stager in sophisticated attacks. In this article, to our knowledge, we are the first to try to address the knowledge gap between the current progress in image steganography and steganalysis academic research focusing on data hiding and the review of the current status of stegomalware (malware payload hiding in images) targeting enterprises with cyberattacks. We present the stegomalware history, generation tools, and file format specification description. Based on our findings, we perform a detailed review of the image steganography techniques, including the recent Generative Adversarial Network (GAN) based models, and the image steganalysis methods, including the Deep Learning (DL) models for hidden data detection. Additionally, a stegomalware detection framework for enterprises is proposed for anomaly-based stegomalware detection, emphasizing the architecture details for different network environments. Finally, the research opportunities and challenges in stegomalware generation and detection are also presented.
Submitted 6 October, 2021;
originally announced October 2021.
-
On the affine recursion on $\mathbb R_+^d$
Authors:
Sara Brofferio,
Marc Peigné,
Thi Da Cam Pham
Abstract:
We fix $d \geq 2$ and denote by $\mathcal S$ the semi-group of $d \times d$ matrices with non-negative entries. We consider a sequence $(A_n, B_n)_{n \geq 1}$ of i.i.d. random variables with values in $\mathcal S\times \mathbb R_+^d$ and study the asymptotic behavior of the Markov chain $(X_n)_{n \geq 0}$ on $\mathbb R_+^d$ defined by:
\[
\forall n \geq 0, \qquad X_{n+1}=A_{n+1}X_n+B_{n+1},
\] where $X_0$ is a fixed random variable. We assume that the Lyapunov exponent of the matrices $A_n$ equals $0$ and prove, under quite general hypotheses, that there exists a unique (infinite) Radon measure $λ$ on $(\mathbb R^+)^d$ which is invariant for the chain $(X_n)_{n \geq 0}$. The existence of $λ$ relies on a recent work by T.D.C. Pham about fluctuations of the norm of products of random matrices. Its uniqueness is a consequence of a general property, called "local contractivity", highlighted about 20 years ago by M. Babillot, Ph. Bougerol and L. Elie in the case of the one-dimensional affine recursion.
Submitted 20 March, 2020;
originally announced March 2020.
-
Critical exponent for a weakly coupled system of semi-linear $σ$-evolution equations with frictional damping
Authors:
Tuan Anh Dao,
Trieu Duong Pham
Abstract:
We are interested in studying the Cauchy problem for a weakly coupled system of semi-linear $σ$-evolution equations with frictional damping. The main purpose of this paper is two-fold. We would like to not only prove the global (in time) existence of small data energy solutions but also indicate the blow-up result for Sobolev solutions when $σ$ is assumed to be any fractional number.
Submitted 5 November, 2019;
originally announced November 2019.
-
DUNet: A deformable network for retinal vessel segmentation
Authors:
Qiangguo Jin,
Zhaopeng Meng,
Tuan D. Pham,
Qi Chen,
Leyi Wei,
Ran Su
Abstract:
Automatic segmentation of retinal vessels in fundus images plays an important role in the diagnosis of some diseases such as diabetes and hypertension. In this paper, we propose Deformable U-Net (DUNet), which exploits the retinal vessels' local features with a U-shape architecture, in an end-to-end manner for retinal vessel segmentation. Inspired by the recently introduced deformable convolutional networks, we integrate the deformable convolution into the proposed network. The DUNet, with upsampling operators to increase the output resolution, is designed to extract context information and enable precise localization by combining low-level feature maps with high-level ones. Furthermore, DUNet captures the retinal vessels at various shapes and scales by adaptively adjusting the receptive fields according to vessels' scales and shapes. Three public datasets, DRIVE, STARE and CHASE_DB1, are used to train and test our model. Detailed comparisons between the proposed network and the deformable neural network and U-Net are provided in our study. Results show that more detailed vessels are extracted by DUNet and it exhibits state-of-the-art performance for retinal vessel segmentation with a global accuracy of 0.9697/0.9722/0.9724 and AUC of 0.9856/0.9868/0.9863 on DRIVE, STARE and CHASE_DB1 respectively. Moreover, to show the generalization ability of the DUNet, we used another two retinal vessel data sets, one named WIDE and the other a synthetic data set with diverse styles, named SYNTHE, to qualitatively and quantitatively analyze and compare it with other methods. Results indicate that DUNet outperforms other state-of-the-art methods.
Submitted 3 November, 2018;
originally announced November 2018.
-
Strongly elliptic pseudodifferential equations on the sphere with radial basis functions
Authors:
T. D. Pham,
T. Tran
Abstract:
Spherical radial basis functions are used to define approximate solutions to strongly elliptic pseudodifferential equations on the unit sphere. These equations arise from geodesy. The approximate solutions are found by the Galerkin and collocation methods. A salient feature of the paper is a {\em unified theory} for error analysis of both approximation methods.
Submitted 26 November, 2013;
originally announced November 2013.
-
Solvable quadratic Lie algebras in low dimensions
Authors:
Tien Dat Pham,
Anh Vu Le,
Minh Thanh Duong
Abstract:
In this paper, we classify solvable quadratic Lie algebras up to dimension 6. In dimensions smaller than 6, we use the Witt decomposition given in \cite{Bou59} and a result in \cite{PU07} to obtain two non-Abelian indecomposable solvable quadratic Lie algebras. In the case of dimension 6, by applying the method of double extension given in \cite{Kac85} and \cite{MR85} and the classification result of singular quadratic Lie algebras in \cite{DPU}, we have three families of solvable quadratic Lie algebras which are indecomposable and not isomorphic.
Submitted 21 April, 2012;
originally announced April 2012.