Search | arXiv e-print repository

Homophily-adjusted social influence estimation

Authors: Hanh T. D. Pham, Daniel K. Sewell

Abstract: Homophily and social influence are two key concepts of social network analysis. Distinguishing between these phenomena is difficult, and approaches to disambiguate the two have been primarily limited to longitudinal data analyses. In this study, we provide sufficient conditions for valid estimation of social influence through cross-sectional data, leading to a novel homophily-adjusted social influence model which addresses the backdoor pathway of latent homophilic features. The oft-used network autocorrelation model (NAM) is the special case of our proposed model with no latent homophily, suggesting that the NAM is only valid when all homophilic attributes are observed. We conducted an extensive simulation study to evaluate the performance of our proposed homophily-adjusted model, comparing its results with those from the conventional NAM. Our findings shed light on the nuanced dynamics of social networks, presenting a valuable tool for researchers seeking to estimate the effects of social influence while accounting for homophily. Code to implement our approach is available at https://github.com/hanhtdpham/hanam.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2401.06436 [pdf, other]

doi 10.1109/KSE53942.2021.9648823

Improving Graph Convolutional Networks with Transformer Layer in social-based items recommendation

Authors: Thi Linh Hoang, Tuan Dung Pham, Viet Cuong Ta

Abstract: In this work, we have proposed an approach for improving the GCN for predicting ratings in social networks. Our model is expanded from the standard model with several layers of transformer architecture. The main focus of the paper is on the encoder architecture for node embedding in the network. Using the embedding layer from the graph-based convolution layer, the attention mechanism could rearrange the feature space to get a more efficient embedding for the downstream task. The experiments showed that our proposed architecture achieves better performance than GCN on the traditional link prediction task.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2302.09184 [pdf]

doi 10.1038/s41524-023-01125-1

Rapid Design of Top-Performing Metal-Organic Frameworks with Qualitative Representations of Building Blocks

Authors: Yigitcan Comlek, Thang Duc Pham, Randall Snurr, Wei Chen

Abstract: Data-driven materials design often encounters challenges where systems require or possess qualitative (categorical) information. Metal-organic frameworks (MOFs) are an example of such material systems. The representation of MOFs through different building blocks makes it a challenge for designers to incorporate qualitative information into design optimization. Furthermore, the large number of potential building blocks leads to a combinatorial challenge, with millions of possible MOFs that could be explored through time consuming physics-based approaches. In this work, we integrated Latent Variable Gaussian Process (LVGP) and Multi-Objective Batch-Bayesian Optimization (MOBBO) to identify top-performing MOFs adaptively, autonomously, and efficiently without any human intervention. Our approach provides three main advantages: (i) no specific physical descriptors are required and only building blocks that construct the MOFs are used in global optimization through qualitative representations, (ii) the method is application and property independent, and (iii) the latent variable approach provides an interpretable model of qualitative building blocks with physical justification. To demonstrate the effectiveness of our method, we considered a design space with more than 47,000 MOF candidates. By searching only ~1% of the design space, LVGP-MOBBO was able to identify all MOFs on the Pareto front and more than 97% of the 50 top-performing designs for the CO$_2$ working capacity and CO$_2$/N$_2$ selectivity properties. Finally, we compared our approach with the Random Forest algorithm and demonstrated its efficiency, interpretability, and robustness.

Submitted 17 February, 2023; originally announced February 2023.

Comments: 35 pages total. First 29 pages belong to the main manuscript and the remaining 6 are for the supplementary information, 13 figures total. 9 figures are on the main manuscript and 4 figures are in the supplementary information. 1 table in the supplementary information

arXiv:2301.06929 [pdf, ps, other]

A conditioned local limit theorem for non-negative random matrices

Authors: M. Peigné, Thi da Cam Pham

Abstract: Let $(S_n)_n$ be the random process on $\mathbb R$ driven by the product of i.i.d. non-negative random matrices and $τ$ its exit time from $]0, +\infty[$. By using the adapted strategy initiated by D. Denisov and V. Wachtel, we obtain an asymptotic estimate and bounds of the probability that the process $(S_k)_k$ remains non negative up to time $n$ and simultaneously belongs to some compact set $[b, b+\ell ]\subset \mathbb R^{*+}$ at time $n$.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2211.14493 [pdf, other]

Multi-fidelity Gaussian Process for Biomanufacturing Process Modeling with Small Data

Authors: Yuan Sun, Winton Nathan-Roberts, Tien Dung Pham, Ellen Otte, Uwe Aickelin

Abstract: In biomanufacturing, developing an accurate model to simulate the complex dynamics of bioprocesses is an important yet challenging task. This is partially due to the uncertainty associated with bioprocesses, high data acquisition cost, and lack of data availability to learn complex relations in bioprocesses. To deal with these challenges, we propose to use a statistical machine learning approach, multi-fidelity Gaussian process, for process modelling in biomanufacturing. Gaussian process regression is a well-established technique based on probability theory which can naturally consider uncertainty in a dataset via Gaussian noise, and multi-fidelity techniques can make use of multiple sources of information with different levels of fidelity, thus suitable for bioprocess modeling with small data. We apply the multi-fidelity Gaussian process to solve two significant problems in biomanufacturing, bioreactor scale-up and knowledge transfer across cell lines, and demonstrate its efficacy on real-world datasets.

Submitted 26 November, 2022; originally announced November 2022.

arXiv:2206.04520 [pdf, other]

doi 10.1007/978-3-031-15063-0_26

An FPGA-based Solution for Convolution Operation Acceleration

Authors: Trung Dinh Pham, Bao Gia Bach, Lam Trinh Luu, Minh Dinh Nguyen, Hai Duc Pham, Khoa Bui Anh, Xuan Quang Nguyen, Cuong Pham Quoc

Abstract: Hardware-based acceleration is an extensive attempt to facilitate many computationally-intensive mathematics operations. This paper proposes an FPGA-based architecture to accelerate the convolution operation - a complex and expensive computing step that appears in many Convolutional Neural Network models. We target the design to the standard convolution operation, intending to launch the product as an edge-AI solution. The project's purpose is to produce an FPGA IP core that can process a convolutional layer at a time. System developers can deploy the IP core with various FPGA families by using Verilog HDL as the primary design language for the architecture. The experimental results show that our single computing core synthesized on a simple edge computing FPGA board can offer 0.224 GOPS. When the board is fully utilized, 4.48 GOPS can be achieved.

Submitted 9 June, 2022; originally announced June 2022.

Comments: 11 pages, 6 figures, accepted to The First International Conference on Intelligence of Things (ICIT 2022)

Journal ref: Lecture Notes on Data Engineering and Communications Technologies, vol 148. Springer, 2022

arXiv:2110.02504 [pdf, other]

Stegomalware: A Systematic Survey of Malware Hiding and Detection in Images, Machine Learning Models and Research Challenges

Authors: Rajasekhar Chaganti, Vinayakumar Ravi, Mamoun Alazab, Tuan D. Pham

Abstract: Malware distribution to the victim network is commonly performed through file attachments in phishing email or from the internet, when the victim interacts with the source of infection. To detect and prevent the malware distribution in the victim machine, the existing end device security applications may leverage techniques such as signature or anomaly-based, machine learning techniques. The well-known file formats Portable Executable (PE) for Windows and Executable and Linkable Format (ELF) for Linux based operating system are used for malware analysis, and the malware detection capabilities of these files has been well advanced for real-time detection. But the malware payload hiding in multimedia using steganography detection has been a challenge for enterprises, as these are rarely seen and usually act as a stager in sophisticated attacks. In this article, to our knowledge, we are the first to try to address the knowledge gap between the current progress in image steganography and steganalysis academic research focusing on data hiding and the review of the stegomalware (malware payload hiding in images) targeting enterprises with cyberattacks current status. We present the stegomalware history, generation tools, file format specification description. Based on our findings, we perform the detail review of the image steganography techniques including the recent Generative Adversarial Networks (GAN) based models and the image steganalysis methods including the Deep Learning(DL) models for hiding data detection. Additionally, the stegomalware detection framework for enterprise is proposed for anomaly based stegomalware detection emphasizing the architecture details for different network environments. Finally, the research opportunities and challenges in stegomalware generation and detection are also presented.

Submitted 6 October, 2021; originally announced October 2021.

arXiv:2003.09380 [pdf, ps, other]

On the affine recursion on $\mathbb R_+^d$

Authors: Sara Brofferio, Marc Peigné, Thi Da Cam Pham

Abstract: We fix $d \geq 2$ and denote $\mathcal S$ the semi-group of $d \times d$ matrices with non negative entries. We consider a sequence $(A_n, B_n)_{n \geq 1} $ of i. i. d. random variables with values in $\mathcal S\times \mathbb R_+^d$ and study the asymptotic behavior of the Markov chain $(X_n)_{n \geq 0}$ on $ \mathbb R_+^d$ defined by: \[ \forall n \geq 0, \qquad X_{n+1}=A_{n+1}X_n+B_{n+1}, \] where $X_0$ is a fixed random variable. We assume that the Lyapunov exponent of the matrices $A_n$ equals $0$ and prove, under quite general hypotheses, that there exists a unique (infinite) Radon measure $λ$ on $(\mathbb R^+)^d$ which is invariant for the chain $(X_n)_{n \geq 0}$. The existence of $λ$ relies on a recent work by T.D.C. Pham about fluctuations of the norm of product of random matrices . Its unicity is a consequence of a general property, called "local contractivity", highlighted about 20 years ago by M. Babillot, Ph. Bougerol et L. Elie in the case of the one dimensional affine recursion .

Submitted 20 March, 2020; originally announced March 2020.

MSC Class: 60J80 (Primary) 60F17; 60K37 (Secondary)

arXiv:1911.01946 [pdf, ps, other]

doi 10.1216/rmj.2022.52.299

Critical exponent for a weakly coupled system of semi-linear $σ$-evolution equations with frictional damping

Authors: Tuan Anh Dao, Trieu Duong Pham

Abstract: We are interested in studying the Cauchy problem for a weakly coupled system of semi-linear $σ$-evolution equations with frictional damping. The main purpose of this paper is two-fold. We would like to not only prove the global (in time) existence of small data energy solutions but also indicate the blow-up result for Sobolev solutions when $σ$ is assumed to be any fractional number.

Submitted 5 November, 2019; originally announced November 2019.

Comments: 18 pages

MSC Class: 35B33; 35L56; 35S05

Journal ref: Rocky Mountain J. Math. 52 (2022) 299-321

arXiv:1811.01206 [pdf, other]

doi 10.1016/j.knosys.2019.04.025

DUNet: A deformable network for retinal vessel segmentation

Authors: Qiangguo Jin, Zhaopeng Meng, Tuan D. Pham, Qi Chen, Leyi Wei, Ran Su

Abstract: Automatic segmentation of retinal vessels in fundus images plays an important role in the diagnosis of some diseases such as diabetes and hypertension. In this paper, we propose Deformable U-Net (DUNet), which exploits the retinal vessels' local features with a U-shape architecture, in an end to end manner for retinal vessel segmentation. Inspired by the recently introduced deformable convolutional networks, we integrate the deformable convolution into the proposed network. The DUNet, with upsampling operators to increase the output resolution, is designed to extract context information and enable precise localization by combining low-level feature maps with high-level ones. Furthermore, DUNet captures the retinal vessels at various shapes and scales by adaptively adjusting the receptive fields according to vessels' scales and shapes. Three public datasets DRIVE, STARE and CHASE_DB1 are used to train and test our model. Detailed comparisons between the proposed network and the deformable neural network, U-Net are provided in our study. Results show that more detailed vessels are extracted by DUNet and it exhibits state-of-the-art performance for retinal vessel segmentation with a global accuracy of 0.9697/0.9722/0.9724 and AUC of 0.9856/0.9868/0.9863 on DRIVE, STARE and CHASE_DB1 respectively. Moreover, to show the generalization ability of the DUNet, we used another two retinal vessel data sets, one is named WIDE and the other is a synthetic data set with diverse styles, named SYNTHE, to qualitatively and quantitatively analyzed and compared with other methods. Results indicates that DUNet outperforms other state-of-the-arts.

Submitted 3 November, 2018; originally announced November 2018.

arXiv:1311.6580 [pdf, ps, other]

Strongly elliptic pseudodifferential equations on the sphere with radial basis functions

Authors: T. D. Pham, T. Tran

Abstract: Spherical radial basis functions are used to define approximate solutions to strongly elliptic pseudodifferential equations on the unit sphere. These equations arise from geodesy. The approximate solutions are found by the Galerkin and collocation methods. A salient feature of the paper is a {\em unified theory} for error analysis of both approximation methods.

Submitted 26 November, 2013; originally announced November 2013.

arXiv:1204.4787 [pdf, ps, other]

Solvable quadratic Lie algebras in low dimensions

Authors: Tien Dat Pham, Anh Vu Le, Minh Thanh Duong

Abstract: In this paper, we classify solvable quadratic Lie algebras up to dimension 6. In dimensions smaller than 6, we use the Witt decomposition given in \cite{Bou59} and a result in \cite{PU07} to obtain two non-Abelian indecomposable solvable quadratic Lie algebras. In the case of dimension 6, by applying the method of double extension given in \cite{Kac85} and \cite{MR85} and the classification result of singular quadratic Lie algebras in \cite{DPU}, we have three families of solvable quadratic Lie algebras which are indecomposable and not isomorphic.

Submitted 21 April, 2012; originally announced April 2012.

Comments: 10 pages

MSC Class: 17B05; 17B30; 17B40

Showing 1–12 of 12 results for author: Pham, T D