Showing 1–2 of 2 results for author: Van Canneyt, S

  1. Representation learning for very short texts using weighted word embedding aggregation

    Authors: Cedric De Boom, Steven Van Canneyt, Thomas Demeester, Bart Dhoedt

    Abstract: Short text messages such as tweets are very noisy and sparse in their use of vocabulary. Traditional textual representations, such as tf-idf, have difficulty grasping the semantic meaning of such texts, which is important in applications such as event detection, opinion mining, news recommendation, etc. We constructed a method based on semantic word embeddings and frequency information to arrive a…

    Submitted 2 July, 2016; originally announced July 2016.

    Comments: 8 pages, 3 figures, 2 tables, appears in Pattern Recognition Letters

  2. Learning Semantic Similarity for Very Short Texts

    Authors: Cedric De Boom, Steven Van Canneyt, Steven Bohez, Thomas Demeester, Bart Dhoedt

    Abstract: Leveraging data on social media, such as Twitter and Facebook, requires information retrieval algorithms to become able to relate very short text fragments to each other. Traditional text similarity methods such as tf-idf cosine-similarity, based on word overlap, mostly fail to produce good results in this case, since word overlap is little or non-existent. Recently, distributed word representations…

    Submitted 2 December, 2015; originally announced December 2015.

    Comments: 6 pages, 5 figures, 3 tables, ReLSD workshop at ICDM 15