Showing 1–5 of 5 results for author: Wurzer, D

Searching in archive cs.
  1. How UMass-FSD Inadvertently Leverages Temporal Bias

    Authors: Dominik Wurzer, Yumeng Qin

    Abstract: First Story Detection describes the task of identifying new events in a stream of documents. The UMass-FSD system is known for its strong performance in First Story Detection competitions. Recently, it has been frequently used as a high accuracy baseline in research publications. We are the first to discover that UMass-FSD inadvertently leverages temporal bias. Interestingly, the discovered bias c…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: Temporal Bias, First Story Detection, Topic Detection and Tracking, UMass-FSD, LSH-FSD

    Journal ref: SIGIR 20, July 2020

  2. Parameterizing Kterm Hashing

    Authors: Dominik Wurzer, Yumeng Qin

    Abstract: Kterm Hashing provides an innovative approach to novelty detection on massive data streams. Previous research focused on maximizing the efficiency of Kterm Hashing and succeeded in scaling First Story Detection to Twitter-size data streams without sacrificing detection accuracy. In this paper, we focus on improving the effectiveness of Kterm Hashing. Traditionally, all kterms are considered as equa…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: Kterm Hashing, Novelty Detection, First Story Detection

    Journal ref: SIGIR 18, July 2018, Ann Arbor, MI, USA

  3. arXiv:1701.01737

    cs.IR

    Spotting Information biases in Chinese and Western Media

    Authors: Dominik Wurzer, Yumeng Qin

    Abstract: Newswire and Social Media are the major sources of information in our time. While the topical demographic of Western Media has been the subject of studies in the past, less is known about Chinese Media. In this paper, we apply event detection and tracking technology to examine the information overlap and differences between Chinese and Western Traditional Media and Social Media. Our experiments reveal a…

    Submitted 6 January, 2017; originally announced January 2017.

  4. arXiv:1611.06322

    cs.SI cs.CL cs.IR

    Spotting Rumors via Novelty Detection

    Authors: Yumeng Qin, Dominik Wurzer, Victor Lavrenko, Cunchen Tang

    Abstract: Rumour detection is hard because the most accurate systems operate retrospectively, only recognizing rumours once they have collected repeated signals. By then the rumours might have already spread and caused harm. We introduce a new category of features based on novelty, tailored to detect rumours early on. To compensate for the absence of repeated signals, we make use of news wire as an addition…

    Submitted 19 November, 2016; originally announced November 2016.

  5. arXiv:1607.02641

    cs.IR

    Randomised Relevance Model

    Authors: Dominik Wurzer, Miles Osborne, Victor Lavrenko

    Abstract: Relevance Models are well-known retrieval models capable of producing competitive results. However, because they use query expansion, they can be very slow. We address this slowness by incorporating two variants of locality sensitive hashing (LSH) into the query expansion process. Results on two document collections suggest that we can obtain large reductions in the amount of work, with a small…

    Submitted 9 July, 2016; originally announced July 2016.

    Comments: Information Retrieval, Query Expansion, Locality Sensitive Hashing, Randomized Algorithm, Relevance Model