SaC2Vec: Information Network Representation with Structure and Content
Authors:
Sambaran Bandyopadhyay,
Harsh Kara,
Anirban Biswas,
M N Murty
Abstract:
Network representation learning (also known as information network embedding) has been the central piece of research in social and information network analysis for the last couple of years. An information network can be viewed as a linked structure of a set of entities. A set of linked web pages and documents, a set of users in a social network are common examples of information network. Network e…
▽ More
Network representation learning (also known as information network embedding) has been the central piece of research in social and information network analysis for the last couple of years. An information network can be viewed as a linked structure of a set of entities. A set of linked web pages and documents, a set of users in a social network are common examples of information network. Network embedding learns low dimensional representations of the nodes, which can further be used for downstream network mining applications such as community detection or node clustering. Information network representation techniques traditionally use only the link structure of the network. But in real world networks, nodes come with additional content such as textual descriptions or associated images. This content is semantically correlated with the network structure and hence using the content along with the topological structure of the network can facilitate the overall network representation. In this paper, we propose Sac2Vec, a network representation technique that exploits both the structure and content. We convert the network into a multi-layered graph and use random walk and language modeling technique to generate the embedding of the nodes. Our approach is simple and computationally fast, yet able to use the content as a complement to structure and vice-versa. We also generalize the approach for networks having multiple types of content in each node. Experimental evaluations on four real world publicly available datasets show the merit of our approach compared to state-of-the-art algorithms in the domain.
△ Less
Submitted 4 July, 2018; v1 submitted 27 April, 2018;
originally announced April 2018.
FSCNMF: Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks
Authors:
Sambaran Bandyopadhyay,
Harsh Kara,
Aswin Kannan,
M N Murty
Abstract:
Analysis and visualization of an information network can be facilitated better using an appropriate embedding of the network. Network embedding learns a compact low-dimensional vector representation for each node of the network, and uses this lower dimensional representation for different network analysis tasks. Only the structure of the network is considered by a majority of the current embedding…
▽ More
Analysis and visualization of an information network can be facilitated better using an appropriate embedding of the network. Network embedding learns a compact low-dimensional vector representation for each node of the network, and uses this lower dimensional representation for different network analysis tasks. Only the structure of the network is considered by a majority of the current embedding algorithms. However, some content is associated with each node, in most of the practical applications, which can help to understand the underlying semantics of the network. It is not straightforward to integrate the content of each node in the current state-of-the-art network embedding methods.
In this paper, we propose a nonnegative matrix factorization based optimization framework, namely FSCNMF which considers both the network structure and the content of the nodes while learning a lower dimensional representation of each node in the network. Our approach systematically regularizes structure based on content and vice versa to exploit the consistency between the structure and content to the best possible extent. We further extend the basic FSCNMF to an advanced method, namely FSCNMF++ to capture the higher order proximities in the network. We conduct experiments on real world information networks for different types of machine learning applications such as node clustering, visualization, and multi-class classification. The results show that our method can represent the network significantly better than the state-of-the-art algorithms and improve the performance across all the applications that we consider.
△ Less
Submitted 4 July, 2018; v1 submitted 15 April, 2018;
originally announced April 2018.