Showing 1–3 of 3 results for author: Mandros, P

  1. arXiv:1908.11682  [pdf, other]

    cs.LG cs.DB cs.IT stat.ML

    Discovering Reliable Correlations in Categorical Data

    Authors: Panagiotis Mandros, Mario Boley, Jilles Vreeken

    Abstract: In many scientific tasks we are interested in discovering whether there exist any correlations in our data. This raises many questions, such as how to reliably and interpretably measure correlation between a multivariate set of attributes, how to do so without having to make assumptions on distribution of the data or the type of correlation, and, how to efficiently discover the top-most reliably c…

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: Accepted to the IEEE International Conference on Data Mining 2019 (ICDM'19)

    ACM Class: H.2.8; G.3

  2. arXiv:1809.05467  [pdf, other]

    cs.AI cs.DB cs.IT

    Discovering Reliable Dependencies from Data: Hardness and Improved Algorithms

    Authors: Panagiotis Mandros, Mario Boley, Jilles Vreeken

    Abstract: The reliable fraction of information is an attractive score for quantifying (functional) dependencies in high-dimensional data. In this paper, we systematically explore the algorithmic implications of using this measure for optimization. We show that the problem is NP-hard, which justifies the usage of worst-case exponential-time as well as heuristic search methods. We then substantially improve t…

    Submitted 14 September, 2018; originally announced September 2018.

    Comments: Accepted to Proceedings of the IEEE International Conference on Data Mining (ICDM'18)

    ACM Class: H.2.8; G.3

  3. arXiv:1705.09391  [pdf, other]

    cs.DB cs.AI cs.IT

    Discovering Reliable Approximate Functional Dependencies

    Authors: Panagiotis Mandros, Mario Boley, Jilles Vreeken

    Abstract: Given a database and a target attribute of interest, how can we tell whether there exists a functional, or approximately functional dependence of the target on any set of other attributes in the data? How can we reliably, without bias to sample size or dimensionality, measure the strength of such a dependence? And, how can we efficiently discover the optimal or $α$-approximate top-$k$ dependencies…

    Submitted 18 June, 2017; v1 submitted 25 May, 2017; originally announced May 2017.

    Comments: Accepted to Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), August 13-17, 2017, Halifax, NS, Canada

    ACM Class: H.2.8; G.3