-
Combating Fake News: A Survey on Identification and Mitigation Techniques
Authors:
Karishma Sharma,
Feng Qian,
He Jiang,
Natali Ruchansky,
Ming Zhang,
Yan Liu
Abstract:
The proliferation of fake news on social media has opened up new directions of research for timely identification and containment of fake news, and mitigation of its widespread impact on public opinion. While much of the earlier research was focused on identification of fake news based on its contents or by exploiting users' engagements with the news on social media, there has been a rising interest in proactive intervention strategies to counter the spread of misinformation and its impact on society. In this survey, we describe the modern-day problem of fake news and, in particular, highlight the technical challenges associated with it. We discuss existing methods and techniques applicable to both identification and mitigation, with a focus on the significant advances in each method and their advantages and limitations. In addition, research has often been limited by the quality of existing datasets and their specific application contexts. To alleviate this problem, we comprehensively compile and summarize characteristic features of available datasets. Furthermore, we outline new directions of research to facilitate future development of effective and interdisciplinary solutions.
Submitted 18 January, 2019;
originally announced January 2019.
-
To Be Connected, or Not to Be Connected: That is the Minimum Inefficiency Subgraph Problem
Authors:
Natali Ruchansky,
Francesco Bonchi,
David Garcia-Soriano,
Francesco Gullo,
Nicolas Kourtellis
Abstract:
We study the problem of extracting a selective connector for a given set of query vertices $Q \subseteq V$ in a graph $G = (V,E)$. A selective connector is a subgraph of $G$ which exhibits some cohesiveness property, and contains the query vertices but does not necessarily connect them all. Relaxing the connectedness requirement allows the connector to detect multiple communities and to be tolerant to outliers. We achieve this by introducing the new measure of network inefficiency and by instantiating our search for a selective connector as the problem of finding the minimum inefficiency subgraph.
We show that the minimum inefficiency subgraph problem is NP-hard, and devise efficient algorithms to approximate it. By means of several case studies in a variety of application domains (such as human brain, cancer, and food networks), we show that our minimum inefficiency subgraph produces high-quality solutions, exhibiting all the desired behaviors of a selective connector.
Submitted 4 September, 2017;
originally announced September 2017.
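The abstract above names network inefficiency without defining it. As a rough illustration only, the sketch below assumes one plausible definition: the sum, over unordered vertex pairs, of 1 - 1/d(u, v), where a disconnected pair contributes 1. That formula, the function names, and the adjacency-dict graph format are all assumptions made here, not details taken from the abstract.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def inefficiency(adj):
    """Assumed definition: sum over unordered pairs of 1 - 1/d(u, v).

    A pair with no connecting path contributes 1 (i.e. 1/d is taken as 0),
    so the measure stays finite on disconnected graphs.
    """
    nodes = list(adj)
    total = 0.0
    for i, u in enumerate(nodes):
        dist = bfs_distances(adj, u)
        for v in nodes[i + 1:]:
            d = dist.get(v)
            total += 1.0 if d is None else 1.0 - 1.0 / d
    return total


# Hypothetical toy graphs: a 3-vertex path, then the same path plus an outlier.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
with_outlier = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
print(inefficiency(path), inefficiency(with_outlier))
```

Under this assumed definition, a disconnected vertex adds a bounded penalty rather than an infinite one, which is consistent with the abstract's point that a selective connector need not connect all query vertices.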
-
Matrix completion with queries
Authors:
Natali Ruchansky,
Mark Crovella,
Evimaria Terzi
Abstract:
In many applications, e.g., recommender systems and traffic monitoring, the data comes in the form of a matrix that is only partially observed and low rank. A fundamental data-analysis task for these datasets is matrix completion, where the goal is to accurately infer the entries missing from the matrix. Even when the data satisfies the low-rank assumption, classical matrix-completion methods may output completions with significant error -- in that the reconstructed matrix differs significantly from the true underlying matrix. Often, this is due to the fact that the information contained in the observed entries is insufficient. In this work, we address this problem by proposing an active version of matrix completion, where queries can be made to the true underlying matrix. Subsequently, we design Order&Extend, which is the first algorithm to unify a matrix-completion approach and a querying strategy into a single algorithm. Order&Extend is able to identify and alleviate insufficient information by judiciously querying a small number of additional entries. In an extensive experimental evaluation on real-world datasets, we demonstrate that our algorithm is efficient and is able to accurately reconstruct the true matrix while asking only a small number of queries.
Submitted 30 April, 2017;
originally announced May 2017.
-
Targeted matrix completion
Authors:
Natali Ruchansky,
Mark Crovella,
Evimaria Terzi
Abstract:
Matrix completion is a problem that arises in many data-analysis settings where the input consists of a partially-observed matrix (e.g., recommender systems, traffic matrix analysis etc.). Classical approaches to matrix completion assume that the input partially-observed matrix is low rank. The success of these methods depends on the number of observed entries and the rank of the matrix; the larger the rank, the more entries need to be observed in order to accurately complete the matrix. In this paper, we deal with matrices that are not necessarily low rank themselves, but rather they contain low-rank submatrices. We propose Targeted, which is a general framework for completing such matrices. In this framework, we first extract the low-rank submatrices and then apply a matrix-completion algorithm to these low-rank submatrices as well as the remainder matrix separately. Although for the completion itself we use state-of-the-art completion methods, our results demonstrate that Targeted achieves significantly smaller reconstruction errors than other classical matrix-completion methods. One of the key technical contributions of the paper lies in the identification of the low-rank submatrices from the input partially-observed matrices.
Submitted 30 April, 2017;
originally announced May 2017.
-
CSI: A Hybrid Deep Model for Fake News Detection
Authors:
Natali Ruchansky,
Sungyong Seo,
Yan Liu
Abstract:
The topic of fake news has drawn attention both from the public and the academic communities. Such misinformation has the potential of affecting public opinion, providing an opportunity for malicious parties to manipulate the outcomes of public events such as elections. Because such high stakes are at play, automatically detecting fake news is an important, yet challenging problem that is not yet well understood. Nevertheless, there are three generally agreed upon characteristics of fake news: the text of an article, the user response it receives, and the source users promoting it. Existing work has largely focused on tailoring solutions to one particular characteristic which has limited their success and generality. In this work, we propose a model that combines all three characteristics for a more accurate and automated prediction. Specifically, we incorporate the behavior of both parties, users and articles, and the group behavior of users who propagate fake news. Motivated by the three characteristics, we propose a model called CSI which is composed of three modules: Capture, Score, and Integrate. The first module is based on the response and text; it uses a Recurrent Neural Network to capture the temporal pattern of user activity on a given article. The second module learns the source characteristic based on the behavior of users, and the two are integrated with the third module to classify an article as fake or not. Experimental analysis on real-world data demonstrates that CSI achieves higher accuracy than existing models, and extracts meaningful latent representations of both users and articles.
Submitted 3 September, 2017; v1 submitted 20 March, 2017;
originally announced March 2017.
-
The Minimum Wiener Connector
Authors:
Natali Ruchansky,
Francesco Bonchi,
David Garcia-Soriano,
Francesco Gullo,
Nicolas Kourtellis
Abstract:
The Wiener index of a graph is the sum of all pairwise shortest-path distances between its vertices. In this paper we study the novel problem of finding a minimum Wiener connector: given a connected graph $G=(V,E)$ and a set $Q\subseteq V$ of query vertices, find a subgraph of $G$ that connects all query vertices and has minimum Wiener index.
We show that The Minimum Wiener Connector admits a polynomial-time (albeit impractical) exact algorithm for the special case where the number of query vertices is bounded. We show that in general the problem is NP-hard, and has no PTAS unless $\mathbf{P} = \mathbf{NP}$. Our main contribution is a constant-factor approximation algorithm running in time $\widetilde{O}(|Q||E|)$.
Thorough experimentation on a large variety of real-world graphs confirms that our method returns smaller and denser solutions than other methods, and does so by adding to the query set $Q$ a small number of important vertices (i.e., vertices with high centrality).
Submitted 16 October, 2016; v1 submitted 2 April, 2015;
originally announced April 2015.
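The Wiener index defined in the abstract above (the sum of all pairwise shortest-path distances) can be computed directly on small unweighted graphs. The following is a minimal sketch of that definition only, not of the paper's approximation algorithm; the function names and the adjacency-dict graph format are assumptions chosen for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs.

    Assumes an undirected, unweighted, connected graph given as a dict
    mapping each vertex to a list of its neighbours.
    """
    total = 0
    for u in adj:
        dist = bfs_distances(adj, u)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        total += sum(dist.values())
    # Every unordered pair was counted once from each endpoint.
    return total // 2


# Hypothetical toy graphs on four vertices: a path and a star.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
print(wiener_index(path), wiener_index(star))
```

On these toy graphs the star has the smaller index, which gives some intuition for why a connector that routes through a central vertex can score well under this objective.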