-
Interest Maximization in Social Networks
Authors:
Rahul Kumar Gautam,
Anjeneya Swami Kare,
S. Durga Bhavani
Abstract:
Nowadays, organizations use viral marketing strategies to promote their products through social networks. It is expensive to send product promotional information directly to all the users in the network. In this context, Kempe et al. \cite{kempe2003maximizing} introduced the Influence Maximization (IM) problem, which identifies the $k$ most influential nodes (spreader nodes) such that the maximum number of people in the network adopt the promotional message.
Many variants of the IM problem have been studied in the literature, such as the Perfect Evangelising Set (PES) and the Perfect Awareness Problem (PAP). In this work, we propose a maximization version of PAP called the Interest Maximization problem. Different people have different levels of interest in a particular product. This is modeled by assigning an interest value to each node in the network. The problem is then to select $k$ initial spreaders such that the sum of the interest values of the people (nodes) who become aware of the message is maximized.
We study the Interest Maximization problem under two popular diffusion models: the Linear Threshold Model (LTM) and the Independent Cascade Model (ICM). We show that the Interest Maximization problem is NP-Hard under LTM. We give a linear programming formulation for the problem under LTM. We propose four heuristic algorithms for the Interest Maximization problem: \LBE{} (\LB{}), Maximum Degree First Heuristic (\MD{}), \PBE{} (\PB{}), and Maximum Profit Based Greedy Heuristic (\MP{}). Extensive experiments have been carried out on many real-world benchmark data sets for both diffusion models. The results show that, among the proposed heuristics, \MP{} performs best in maximizing the interest value.
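As a rough illustration of the objective described in this abstract, the sketch below evaluates the total interest value reached by a given seed set under a simplified, deterministic threshold rule. It is not the authors' formulation: the function name `interest_under_ltm`, the count-based thresholds (a node becomes aware once the number of its aware neighbors reaches its threshold), and the toy graph are all assumptions made for this example.

```python
def interest_under_ltm(adj, threshold, interest, seeds):
    """Return (total interest, aware nodes) for a seed set.

    Assumption: a simplified deterministic threshold rule in which a node
    becomes aware once at least threshold[v] of its neighbors are aware.
    adj maps each node to a list of neighbors (undirected graph).
    """
    aware = set(seeds)
    changed = True
    while changed:  # repeat until no new node becomes aware
        changed = False
        for v in adj:
            if v in aware:
                continue
            aware_neighbors = sum(1 for u in adj[v] if u in aware)
            if aware_neighbors >= threshold[v]:
                aware.add(v)
                changed = True
    return sum(interest[v] for v in aware), aware


# Hypothetical toy instance: a path a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
threshold = {"a": 1, "b": 1, "c": 2, "d": 1}
interest = {"a": 1, "b": 2, "c": 5, "d": 3}

total_one_seed, _ = interest_under_ltm(adj, threshold, interest, {"a"})
total_two_seeds, _ = interest_under_ltm(adj, threshold, interest, {"a", "d"})
```

In this toy instance, seeding only `a` reaches `a` and `b`, while seeding both ends also satisfies the threshold of the high-interest node `c`, which is the kind of trade-off a seed-selection heuristic has to weigh.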
Submitted 12 April, 2024;
originally announced April 2024.
-
Approximation Algorithms for the Graph Burning on Cactus and Directed Trees
Authors:
Rahul Kumar Gautam,
Anjeneya Swami Kare,
S. Durga Bhavani
Abstract:
Given a graph $G=(V, E)$, the problem of Graph Burning is to find a sequence of nodes from $V$, called a burning sequence, to burn the whole graph. This is a discrete-step process, and at each step, an unburned vertex is selected as an agent to spread fire to its neighbors by marking it as a burnt node. A burnt node spreads the fire to its neighbors at the next consecutive step. The goal is to find the burning sequence of minimum length. The Graph Burning problem is NP-Hard for general graphs and even for binary trees. A few approximation results are known, including a $3$-approximation algorithm for general graphs and a $2$-approximation algorithm for trees.
Graph Burning on directed graphs is more challenging than on undirected graphs. In this paper, we propose 1) a $2.75$-approximation algorithm for cactus graphs (undirected), 2) a $3$-approximation algorithm for multi-rooted directed trees (polytrees), and 3) a $1.905$-approximation algorithm for single-rooted directed trees (arborescences). We implement all three approximation algorithms and report results on randomly generated cactus graphs and directed trees.
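The burning process described in this abstract can be simulated directly. The following is a minimal sketch for undirected graphs given as adjacency lists; the function name `burns_graph` and the example path are assumptions for illustration, and the checker does not enforce that each chosen source is still unburned when it is selected. It only verifies a candidate sequence and is unrelated to the approximation algorithms proposed in the paper.

```python
def burns_graph(adj, sequence):
    """Check whether a candidate burning sequence burns every vertex.

    adj maps each vertex to a list of neighbors (undirected graph).
    At each step the fire first spreads from all burnt vertices to
    their neighbors, then the next vertex of the sequence is ignited.
    """
    burned = set()
    for source in sequence:
        # Fire spreads one hop from every vertex burnt in earlier steps.
        spread = {w for v in burned for w in adj[v]}
        burned |= spread
        # The newly selected source burns in this step.
        burned.add(source)
    return len(burned) == len(adj)


# Hypothetical example: the path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ok = burns_graph(path, [1, 3])       # vertex 1 reaches 0 and 2 in step 2
too_short = burns_graph(path, [0, 3])  # vertex 2 is never reached
```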
Submitted 17 July, 2023;
originally announced July 2023.
-
MULTIPAR: Supervised Irregular Tensor Factorization with Multi-task Learning
Authors:
Yifei Ren,
Jian Lou,
Li Xiong,
Joyce C Ho,
Xiaoqian Jiang,
Sivasubramanium Bhavani
Abstract:
Tensor factorization has received increasing interest due to its intrinsic ability to capture latent factors in multi-dimensional data, with many applications such as recommender systems and Electronic Health Records (EHR) mining. PARAFAC2 and its variants have been proposed to address irregular tensors where one of the tensor modes is not aligned, e.g., different users in recommender systems or patients in EHRs may have records of different lengths. PARAFAC2 has been successfully applied to EHRs for extracting meaningful medical concepts (phenotypes). Despite recent advancements, the predictability and interpretability of current models are not satisfactory, which limits their utility for downstream analysis. In this paper, we propose MULTIPAR: a supervised irregular tensor factorization with multi-task learning. MULTIPAR is flexible enough to incorporate both static (e.g., in-hospital mortality prediction) and continuous or dynamic (e.g., the need for ventilation) tasks. By supervising the tensor factorization with downstream prediction tasks and leveraging information from multiple related predictive tasks, MULTIPAR can yield not only more meaningful phenotypes but also better predictive performance for downstream tasks. We conduct extensive experiments on two real-world temporal EHR datasets to demonstrate that MULTIPAR is scalable and achieves better tensor fit with more meaningful subgroups and stronger predictive performance compared to existing state-of-the-art methods.
Submitted 9 August, 2022; v1 submitted 1 August, 2022;
originally announced August 2022.
-
Communication Efficient Generalized Tensor Factorization for Decentralized Healthcare Networks
Authors:
Jing Ma,
Qiuchen Zhang,
Jian Lou,
Li Xiong,
Sivasubramanium Bhavani,
Joyce C. Ho
Abstract:
Tensor factorization has proved to be an efficient unsupervised learning approach for health data analysis, especially for computational phenotyping, where high-dimensional Electronic Health Records (EHRs) with patients' history of medical procedures, medications, diagnoses, lab tests, etc., are converted to meaningful and interpretable medical concepts. Federated tensor factorization distributes the tensor computation to multiple workers under the coordination of a central server, which enables jointly learning the phenotypes across multiple hospitals while preserving the privacy of the patient information. However, existing federated tensor factorization algorithms encounter the single-point-failure issue because of the central server, which is not only easily exposed to external attacks but also limits the number of clients sharing information with the server under restricted uplink bandwidth. In this paper, we propose CiderTF, a communication-efficient decentralized generalized tensor factorization. It reduces the uplink communication cost by leveraging a four-level communication reduction strategy designed for generalized tensor factorization, which has the flexibility to model different tensor distributions with multiple kinds of loss functions. Experiments on two real-world EHR datasets demonstrate that CiderTF achieves comparable convergence with a communication reduction of up to 99.99%.
Submitted 3 November, 2022; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Link Prediction Approach to Recommender Systems
Authors:
T. Jaya Lakshmi,
S. Durga Bhavani
Abstract:
The recommender system problem is very popular, with myriad available solutions. A novel approach proposed in the literature uses the link prediction problem in social networks: it models the typical user-item information as a bipartite network in which link prediction amounts to recommending an item to a user. The standard recommender system methods suffer from the problems of sparsity and scalability. Since link prediction measures involve computations pertaining to small neighborhoods in the network, this approach leads to a scalable solution to recommendation. One of the issues in this conversion is that the link prediction problem is modelled as a binary classification task, whereas the recommender system problem is solved as a regression task in which the rating of the link is to be predicted. We overcome this issue by predicting the top $k$ links as recommendations with high ratings without predicting the actual rating. Our work extends similar approaches in the literature by focusing on exploiting probabilistic measures for link prediction. Moreover, in the proposed approach, prediction measures that utilize temporal information available on the links prove to be more effective in improving the accuracy of prediction. This approach is evaluated on the benchmark 'Movielens' dataset. We show that the use of temporal probabilistic measures helps improve the quality of recommendations. The temporal random-walk based measure T_Flow improves recommendation accuracy by 4%, and the temporal cooccurrence probability measure improves prediction accuracy by 10%, over the item-based collaborative filtering method in terms of AUROC score.
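The idea of recommending the top $k$ predicted links in a user-item bipartite network, without predicting ratings, can be sketched as follows. This example does not implement the temporal or probabilistic measures evaluated in the paper; it substitutes a plain neighborhood-overlap score, and the function name `recommend_top_k` and the toy data are assumptions made for illustration.

```python
from collections import Counter


def recommend_top_k(user_items, user, k):
    """Rank unseen items for `user` by a simple neighborhood score.

    Assumption: the score of a candidate (user, item) link is the sum,
    over other users who hold the item, of the number of items they
    share with `user`. user_items maps each user to a set of items.
    """
    seen = user_items[user]
    scores = Counter()
    for other, items in user_items.items():
        if other == user:
            continue
        overlap = len(seen & items)  # items shared with the target user
        if overlap == 0:
            continue
        for item in items - seen:    # candidate links only
            scores[item] += overlap
    # Sort by descending score, breaking ties by item name.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical toy data: users mapped to the items they have rated.
user_items = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"B", "D"},
    "u4": {"E"},
}
top_two = recommend_top_k(user_items, "u1", 2)
```

Because the score only touches users who share an item with the target user, the computation stays within a small neighborhood of the network, which is the property the abstract credits for scalability.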
Submitted 18 February, 2021;
originally announced February 2021.
-
Faster Heuristics for Graph Burning
Authors:
Rahul Kumar Gautam,
Anjeneya Swami Kare,
S. Durga Bhavani
Abstract:
Graph burning is a process of information spreading through a network by an agent in discrete steps. The problem is to find an optimal sequence of nodes to be given the information so that the network is covered in the least number of steps. The graph burning problem is NP-Hard, and two approximation algorithms and a few heuristics have been proposed for it in the literature. In this work, we propose three heuristics, namely, Backbone Based Greedy Heuristic (BBGH), Improved Cutting Corners Heuristic (ICCH) and Component Based Recursive Heuristic (CBRH). These are mainly based on the eigenvector centrality measure. BBGH finds a backbone of the network and greedily picks the vertex to be burned from the vertices of the backbone. ICCH is a shortest-path based heuristic and greedily picks the vertex to burn from the best central nodes. The burning number problem on disconnected graphs is harder than on connected graphs. For example, the burning number problem is easy on a path, whereas it is NP-Hard on disjoint paths. In practice, large networks are generally disconnected; moreover, even if the input graph is connected, the graph induced by the unburned vertices may become disconnected during the burning process. For disconnected graphs, the ordering of the components is crucial. Our CBRH works well on disconnected graphs as it prioritizes the components. All the heuristics have been implemented and tested on several benchmark networks, including large networks with more than $50$K nodes. The experiments also include a comparison with the approximation algorithms. The advantages of our algorithms are that they are much simpler to implement and several orders of magnitude faster than the heuristics proposed in the literature.
Submitted 20 August, 2020;
originally announced August 2020.