
Spotting Rumors via Novelty Detection

Authors: Yumeng Qin, Dominik Wurzer, Victor Lavrenko, Cunchen Tang

Abstract: Rumour detection is hard because the most accurate systems operate retrospectively, only recognizing rumours once they have collected repeated signals. By then the rumours might have already spread and caused harm. We introduce a new category of features based on novelty, tailored to detect rumours early on. To compensate for the absence of repeated signals, we make use of news wire as an additional data source. Unconfirmed (novel) information with respect to the news articles is considered as an indication of rumours. Additionally we introduce pseudo feedback, which assumes that documents that are similar to previous rumours, are more likely to also be a rumour. Comparison with other real-time approaches shows that novelty based features in conjunction with pseudo feedback perform significantly better, when detecting rumours instantly after their publication.

Submitted 19 November, 2016; originally announced November 2016.

arXiv:1607.02641 [pdf, other]

Randomised Relevance Model

Authors: Dominik Wurzer, Miles Osborne, Victor Lavrenko

Abstract: Relevance Models are well-known retrieval models and capable of producing competitive results. However, because they use query expansion they can be very slow. We address this slowness by incorporating two variants of locality sensitive hashing (LSH) into the query expansion process. Results on two document collections suggest that we can obtain large reductions in the amount of work, with a small reduction in effectiveness. Our approach is shown to be additive when pruning query terms.

Submitted 9 July, 2016; originally announced July 2016.

Comments: Information Retrieval, Query Expansion, Locality Sensitive Hashing, Randomized Algorithm, Relevance Model

arXiv:1305.3107 [pdf, ps, other]

I Wish I Didn't Say That! Analyzing and Predicting Deleted Messages in Twitter

Authors: Sasa Petrovic, Miles Osborne, Victor Lavrenko

Abstract: Twitter has become a major source of data for social media researchers. One important aspect of Twitter not previously considered are deletions -- removal of tweets from the stream. Deletions can be due to a multitude of reasons such as privacy concerns, rashness or attempts to undo public statements. We show how deletions can be automatically predicted ahead of time and analyse which tweets are likely to be deleted and how.

Submitted 14 May, 2013; originally announced May 2013.

Comments: Unpublished

Showing 1–3 of 3 results for author: Lavrenko, V