Showing 1–6 of 6 results for author: Khayati, M

  1. arXiv:1909.08786  [pdf, other]

    cs.SI cs.DS cs.LG physics.soc-ph stat.ML

    DAOC: Stable Clustering of Large Networks

    Authors: Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: Clustering is a crucial component of many data mining systems involving the analysis and exploration of various data. Data diversity calls for clustering algorithms to be accurate while providing stable (i.e., deterministic and robust) results on arbitrary input networks. Moreover, modern systems often operate with large datasets, which implicitly constrains the complexity of the clustering algori…

    Submitted 17 December, 2019; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: IEEE BigData'19, Special Session on Intelligent Data Mining

    MSC Class: 91C20 (Primary); 05C85; 62G35; 90B10; 91D30 (Secondary); 68T30 (Secondary)

    ACM Class: H.3.3; H.3.4; H.1.1; H.2.8; I.2.6

  2. arXiv:1902.01691  [pdf, other]

    cs.DS physics.data-an

    Accuracy Evaluation of Overlapping and Multi-resolution Clustering Algorithms on Large Datasets

    Authors: Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: Performance of clustering algorithms is evaluated with the help of accuracy metrics. There is a great diversity of clustering algorithms, which are key components of many data analysis and exploration systems. However, there exist only few metrics for the accuracy measurement of overlapping and multi-resolution clustering algorithms on large datasets. In this paper, we first discuss existing metri…

    Submitted 14 February, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: The application executable and sources: https://github.com/eXascaleInfolab/xmeasures

    MSC Class: 62H30; 33F05; 91C20; 68N30

    Journal ref: 2019 IEEE International Conference on Big Data and Smart Computing

  3. arXiv:1902.00490  [pdf, other]

    stat.AP cs.DS cs.SI physics.data-an

    StaTIX - Statistical Type Inference on Linked Data

    Authors: Artem Lutov, Soheil Roshankish, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: Large knowledge bases typically contain data adhering to various schemas with incomplete and/or noisy type information. This seriously complicates further integration and post-processing efforts, as type information is crucial in correctly handling the data. In this paper, we introduce a novel statistical type inference method, called StaTIX, to effectively infer instance types in Linked Data sets…

    Submitted 16 February, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Application sources and executables: https://github.com/eXascaleInfolab/StaTIX

    MSC Class: 62H30; 91C20; 05C78; 62-07

    Journal ref: 2018 IEEE International Conference on Big Data

  4. arXiv:1902.00475  [pdf, other]

    cs.DC eess.SY physics.data-an

    Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling Clustering Algorithms on NUMA Architectures

    Authors: Artem Lutov, Mourad Khayati, Philippe Cudré-Mauroux

    Abstract: There is a great diversity of clustering and community detection algorithms, which are key components of many data analysis and exploration systems. To the best of our knowledge, however, there does not exist yet any uniform benchmarking framework, which is publicly available and suitable for the parallel benchmarking of diverse clustering algorithms on a wide range of synthetic and real-world dat…

    Submitted 1 February, 2019; originally announced February 2019.

    Comments: Application sources and executables: https://github.com/eXascaleInfolab/clubmark

    MSC Class: 62H30; 65Y05; 68M20; 91C20

    Journal ref: 2018 IEEE International Conference on Data Mining Workshops (ICDMW)

  5. arXiv:1710.09788  [pdf, other]

    cs.AI

    FashionBrain Project: A Vision for Understanding Europe's Fashion Data Universe

    Authors: Alessandro Checco, Gianluca Demartini, Alexander Loeser, Ines Arous, Mourad Khayati, Matthias Dantone, Richard Koopmanschap, Svetlin Stalinov, Martin Kersten, Ying Zhang

    Abstract: A core business in the fashion industry is the understanding and prediction of customer needs and trends. Search engines and social networks are at the same time a fundamental bridge and a costly middleman between the customer's purchase intention and the retailer. To better exploit Europe's distinctive characteristics e.g., multiple languages, fashion and cultural differences, it is pivotal to re…

    Submitted 26 October, 2017; originally announced October 2017.

  6. arXiv:1612.07636  [pdf, other]

    cs.DL cs.SI physics.soc-ph

    ScienceWISE: Topic Modeling over Scientific Literature Networks

    Authors: Andrea Martini, Artem Lutov, Valerio Gemmetto, Andrii Magalich, Alessio Cardillo, Alex Constantin, Vasyl Palchykov, Mourad Khayati, Philippe Cudré-Mauroux, Alexey Boyarsky, Oleg Ruchayskiy, Diego Garlaschelli, Paolo De Los Rios, Karl Aberer

    Abstract: We provide an up-to-date view on the knowledge management system ScienceWISE (SW) and address issues related to the automatic assignment of articles to research topics. So far, SW has been proven to be an effective platform for managing large volumes of technical articles by means of ontological concept-based browsing. However, as the publication of research articles accelerates, the expressivity…

    Submitted 22 December, 2016; originally announced December 2016.

    Comments: 6 pages; 5 figures