Showing 1–6 of 6 results for author: Lutov, A

  1. arXiv:1912.08808  [pdf, other]

    cs.SI cs.LG stat.ML

    Bridging the Gap between Community and Node Representations: Graph Embedding via Community Detection

    Authors: Artem Lutov, Dingqi Yang, Philippe Cudré-Mauroux

    Abstract: Graph embedding has become a key component of many data mining and analysis systems. Current graph embedding approaches either sample a large number of node pairs from a graph to learn node embeddings via stochastic optimization or factorize a high-order proximity/adjacency matrix of the graph via computationally expensive matrix factorization techniques. These approaches typically require signifi…

    Submitted 17 December, 2019; originally announced December 2019.

    Comments: IEEE BigData'19, Special Session on Information Granulation in Data Science and Scalable Computing

    MSC Class: 05C60 (Primary); 14E25; 30L05; 54C25; 57N35; 91C20 (Secondary); 05C85 (Secondary); 62G35 (Secondary); 91D30 (Secondary); 68T30 (Secondary) ACM Class: I.2.6; E.1; F.2.2; H.3.4

  2. arXiv:1909.08786  [pdf, other]

    cs.SI cs.DS cs.LG physics.soc-ph stat.ML

    DAOC: Stable Clustering of Large Networks

    Authors: Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: Clustering is a crucial component of many data mining systems involving the analysis and exploration of various data. Data diversity calls for clustering algorithms to be accurate while providing stable (i.e., deterministic and robust) results on arbitrary input networks. Moreover, modern systems often operate with large datasets, which implicitly constrains the complexity of the clustering algori…

    Submitted 17 December, 2019; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: IEEE BigData'19, Special Session on Intelligent Data Mining

    MSC Class: 91C20 (Primary); 05C85; 62G35; 90B10; 91D30 (Secondary); 68T30 (Secondary) ACM Class: H.3.3; H.3.4; H.1.1; H.2.8; I.2.6

  3. arXiv:1902.01691  [pdf, other]

    cs.DS physics.data-an

    Accuracy Evaluation of Overlapping and Multi-resolution Clustering Algorithms on Large Datasets

    Authors: Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: Performance of clustering algorithms is evaluated with the help of accuracy metrics. There is a great diversity of clustering algorithms, which are key components of many data analysis and exploration systems. However, there exist only few metrics for the accuracy measurement of overlapping and multi-resolution clustering algorithms on large datasets. In this paper, we first discuss existing metri…

    Submitted 14 February, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: The application executable and sources: https://github.com/eXascaleInfolab/xmeasures

    MSC Class: 62H30; 33F05; 91C20; 68N30

    Journal ref: 2019 IEEE International Conference on Big Data and Smart Computing

  4. arXiv:1902.00490  [pdf, other]

    stat.AP cs.DS cs.SI physics.data-an

    StaTIX - Statistical Type Inference on Linked Data

    Authors: Artem Lutov, Soheil Roshankish, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: Large knowledge bases typically contain data adhering to various schemas with incomplete and/or noisy type information. This seriously complicates further integration and post-processing efforts, as type information is crucial in correctly handling the data. In this paper, we introduce a novel statistical type inference method, called StaTIX, to effectively infer instance types in Linked Data sets…

    Submitted 16 February, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Application sources and executables: https://github.com/eXascaleInfolab/StaTIX

    MSC Class: 62H30; 91C20; 05C78; 62-07

    Journal ref: 2018 IEEE International Conference on Big Data

  5. arXiv:1902.00475  [pdf, other]

    cs.DC eess.SY physics.data-an

    Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling Clustering Algorithms on NUMA Architectures

    Authors: Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: There is a great diversity of clustering and community detection algorithms, which are key components of many data analysis and exploration systems. To the best of our knowledge, however, there does not exist yet any uniform benchmarking framework, which is publicly available and suitable for the parallel benchmarking of diverse clustering algorithms on a wide range of synthetic and real-world dat…

    Submitted 1 February, 2019; originally announced February 2019.

    Comments: Application sources and executables: https://github.com/eXascaleInfolab/clubmark

    MSC Class: 62H30; 65Y05; 68M20; 91C20

    Journal ref: 2018 IEEE International Conference on Data Mining Workshops (ICDMW)

  6. arXiv:1612.07636  [pdf, other]

    cs.DL cs.SI physics.soc-ph

    ScienceWISE: Topic Modeling over Scientific Literature Networks

    Authors: Andrea Martini, Artem Lutov, Valerio Gemmetto, Andrii Magalich, Alessio Cardillo, Alex Constantin, Vasyl Palchykov, Mourad Khayati, Philippe Cudré-Mauroux, Alexey Boyarsky, Oleg Ruchayskiy, Diego Garlaschelli, Paolo De Los Rios, Karl Aberer

    Abstract: We provide an up-to-date view on the knowledge management system ScienceWISE (SW) and address issues related to the automatic assignment of articles to research topics. So far, SW has been proven to be an effective platform for managing large volumes of technical articles by means of ontological concept-based browsing. However, as the publication of research articles accelerates, the expressivity…

    Submitted 22 December, 2016; originally announced December 2016.

    Comments: 6 pages; 5 figures