Showing 1–5 of 5 results for author: Fedoryszak, M

  1. arXiv:1909.01436

    stat.ML cs.IR cs.LG

    Discriminative Topic Modeling with Logistic LDA

    Authors: Iryna Korshunova, Hanchen Xiong, Mateusz Fedoryszak, Lucas Theis

    Abstract: Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections of non-categorical items is still challenging. Yet many problems with much richer data share a similar structure and could benefit from the vast literature on LDA. We propose logistic LDA, a novel discriminative variant of latent Dirichlet allocation which is easy to apply to arbitrary inputs. In par…

    Submitted 7 January, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Journal ref: Advances in Neural Information Processing Systems 32, 2019

  2. Real-time Event Detection on Social Data Streams

    Authors: Mateusz Fedoryszak, Brent Frederick, Vijay Rajaram, Changtao Zhong

    Abstract: Social networks are quickly becoming the primary medium for discussing what is happening around real-world events. The information that is generated on social platforms like Twitter can produce rich data streams for immediate insights into ongoing matters and the conversations around them. To tackle the problem of event detection, we model events as a list of clusters of trending entities over tim…

    Submitted 25 July, 2019; originally announced July 2019.

    Comments: Accepted as a full paper at KDD 2019 on April 29, 2019

  3. arXiv:1303.6906

    cs.IR cs.DL

    Large scale citation matching using Apache Hadoop

    Authors: Mateusz Fedoryszak, Dominika Tkaczyk, Łukasz Bolikowski

    Abstract: During the process of citation matching links from bibliography entries to referenced publications are created. Such links are indicators of topical similarity between linked texts, are used in assessing the impact of the referenced document and improve navigation in the user interfaces of digital libraries. In this paper we present a citation matching method and show how to scale it up to handle…

    Submitted 26 March, 2013; originally announced March 2013.

    Comments: 11 pages, 4 figures

    ACM Class: H.3.3

  4. arXiv:1303.5367

    cs.IR cs.DL

    Taming the zoo - about algorithms implementation in the ecosystem of Apache Hadoop

    Authors: Piotr Jan Dendek, Artur Czeczko, Mateusz Fedoryszak, Adam Kawa, Piotr Wendykier, Lukasz Bolikowski

    Abstract: Content Analysis System (CoAnSys) is a research framework for mining scientific publications using Apache Hadoop. This article describes the algorithms currently implemented in CoAnSys including classification, categorization and citation matching of scientific publications. The size of the input data classifies these algorithms in the range of big data problems, which can be efficiently solved on…

    Submitted 16 March, 2014; v1 submitted 21 March, 2013; originally announced March 2013.

    Comments: This paper (with changed content) appeared under the title "Content Analysis of Scientific Articles in Apache Hadoop Ecosystem" in "Intelligent Tools for Building a Scientific Information Platform: From Research to Implementation", "Studies in Computational Intelligence", Volume 541, 2014, http://link.springer.com/book/10.1007/978-3-319-04714-0

    ACM Class: H.3.7

  5. arXiv:1303.5234

    cs.SE cs.DC

    How to perform research in Hadoop environment not losing mental equilibrium - case study

    Authors: Piotr Jan Dendek, Artur Czeczko, Mateusz Fedoryszak, Adam Kawa, Piotr Wendykier, Lukasz Bolikowski

    Abstract: Conducting a research in an efficient, repetitive, evaluable, but also convenient (in terms of development) way has always been a challenge. To satisfy those requirements in a long term and simultaneously minimize costs of the software engineering process, one has to follow a certain set of guidelines. This article describes such guidelines based on the research environment called Content Analysis…

    Submitted 16 March, 2014; v1 submitted 21 March, 2013; originally announced March 2013.

    Comments: This paper (with changed content) appeared under the title "Chrum: The Tool for Convenient Generation of Apache Oozie Workflows" in "Intelligent Tools for Building a Scientific Information Platform: From Research to Implementation", "Studies in Computational Intelligence", Volume 541, 2014, http://link.springer.com/book/10.1007/978-3-319-04714-0

    ACM Class: H.3.7