Showing 1–27 of 27 results for author: Mamoulis, N

  1. arXiv:2310.04069  [pdf, other]

    cs.DB

    Spatio-temporal flow patterns

    Authors: Chrysanthi Kosyfaki, Nikos Mamoulis, Reynold Cheng, Ben Kao

    Abstract: Transportation companies and organizations routinely collect huge volumes of passenger transportation data. By aggregating these data (e.g., counting the number of passengers going from one place to another in every 30-minute interval), it becomes possible to analyze the movement behavior of passengers in a metropolitan area. In this paper, we study the problem of finding important trends in passeng…

    Submitted 12 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  2. arXiv:2307.09256  [pdf, other]

    cs.DB

    Two-layer Space-oriented Partitioning for Non-point Data

    Authors: Dimitrios Tsitsigkos, Panagiotis Bouros, Konstantinos Lampropoulos, Nikos Mamoulis, Manolis Terrovitis

    Abstract: Non-point spatial objects (e.g., polygons, linestrings, etc.) are ubiquitous. We study the problem of indexing non-point objects in memory for range queries and spatial intersection joins. We propose a secondary partitioning technique for space-oriented partitioning indices (e.g., grids), which improves their performance significantly, by avoiding the generation and elimination of duplicate result…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: To appear in the IEEE Transactions on Knowledge and Data Engineering

  3. arXiv:2307.01716  [pdf, other]

    cs.DB

    Raster Interval Object Approximations for Spatial Intersection Joins

    Authors: Thanasis Georgiadis, Eleni Tzirita Zacharatou, Nikos Mamoulis

    Abstract: Spatial join processing techniques that identify intersections between complex geometries (e.g., polygons) commonly follow a two-step filter-and-refine pipeline; the filter step evaluates the query predicate on the minimum bounding rectangles (MBRs) of objects and the refinement step eliminates false positives by applying the query on the exact geometries. We propose a raster intervals approximatio…

    Submitted 1 May, 2024; v1 submitted 4 July, 2023; originally announced July 2023.

    Comments: 34 pages

  4. arXiv:2205.01905  [pdf, other]

    cs.DB

    Three-dimensional Geospatial Interlinking with JedAI-spatial

    Authors: Marios Papamichalopoulos, George Papadakis, George Mandilaras, Maria Despoina Siampou, Nikos Mamoulis, Manolis Koubarakis

    Abstract: Geospatial data constitutes a considerable part of (Semantic) Web data, but so far, its sources are inadequately interlinked in the Linked Open Data cloud. Geospatial Interlinking aims to cover this gap by associating geometries with topological relations like those of the Dimensionally Extended 9-Intersection Model. Due to its quadratic time complexity, various algorithms aim to carry out Geospat…

    Submitted 10 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

  5. arXiv:2112.08638  [pdf, other]

    cs.DB

    Evaluating Hybrid Graph Pattern Queries Using Runtime Index Graphs

    Authors: Xiaoying Wu, Dimitri Theodoratos, Nikos Mamoulis, Michael Lan

    Abstract: Graph pattern matching is a fundamental operation for the analysis and exploration of data graphs. In this paper, we present a novel approach for efficiently finding homomorphic matches for hybrid graph patterns, where each pattern edge may be mapped either to an edge or to a path in the input data, thus allowing for higher expressiveness and flexibility in query formulation. A key component of our ap…

    Submitted 28 September, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

  6. arXiv:2110.05041  [pdf, other]

    cs.DB

    Provenance in Temporal Interaction Networks

    Authors: Chrysanthi Kosyfaki, Nikos Mamoulis

    Abstract: In temporal interaction networks, vertices correspond to entities, which exchange data quantities (e.g., money, bytes, messages) over time. Tracking the origin of data that have reached a given vertex at any time can help data analysts to understand the reasons behind the accumulated quantity at the vertex or behind the interactions between entities. In this paper, we study data provenance in a te…

    Submitted 11 October, 2021; originally announced October 2021.

  7. arXiv:2104.10939  [pdf, other]

    cs.DB cs.DS

    HINT: A Hierarchical Index for Intervals in Main Memory

    Authors: George Christodoulou, Panagiotis Bouros, Nikos Mamoulis

    Abstract: Indexing intervals is a fundamental problem, finding a wide range of applications. Recent work on managing large collections of intervals in main memory focused on overlap joins and temporal aggregation problems. In this paper, we propose novel and efficient in-memory indexing techniques for intervals, with a focus on interval range queries, which are a basic component of many search and analysis…

    Submitted 7 March, 2022; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: Preprint of the "HINT: A Hierarchical Index for Intervals in Main Memory" paper to appear at the 2022 ACM SIGMOD/PODS International Conference on Management of Data, Philadelphia, PA, USA, June 12-17, 2022

  8. Discovering Closed and Maximal Embedded Patterns from Large Tree Data

    Authors: Xiaoying Wu, Dimitri Theodoratos, Nikos Mamoulis

    Abstract: We address the problem of summarizing embedded tree patterns extracted from large data trees. We do so by defining and mining closed and maximal embedded unordered tree patterns from a single large data tree. We design an embedded frequent pattern mining algorithm extended with a local closedness checking technique. This algorithm is called {\em closedEmbTM-prune} as it eagerly eliminates non-clos…

    Submitted 26 December, 2020; originally announced December 2020.

  9. arXiv:2012.10024  [pdf, other]

    cs.LG cs.SI

    Leveraging Meta-path Contexts for Classification in Heterogeneous Information Networks

    Authors: Xiang Li, Danhao Ding, Ben Kao, Yizhou Sun, Nikos Mamoulis

    Abstract: A heterogeneous information network (HIN) has as vertices objects of different types and as edges the relations between objects, which are also of various types. We study the problem of classifying objects in HINs. Most existing methods perform poorly when given scarce labeled objects as training sets, and methods that improve classification accuracy under such scenarios are often computationally…

    Submitted 20 February, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: Accepted by ICDE 2021

  10. arXiv:2009.02845  [pdf, other]

    cs.LG cs.CR cs.DC stat.ML

    Fast and Secure Distributed Nonnegative Matrix Factorization

    Authors: Yuqiu Qian, Conghui Tan, Danhao Ding, Hui Li, Nikos Mamoulis

    Abstract: Nonnegative matrix factorization (NMF) has been successfully applied in several data mining tasks. Recently, there is an increasing interest in the acceleration of NMF, due to its high cost on large matrices. On the other hand, the privacy issue of NMF over federated data is worthy of attention, since NMF is prevalently applied in image and text analysis which may involve leveraging privacy data (…

    Submitted 6 September, 2020; originally announced September 2020.

    Comments: Published in IEEE Transactions on Knowledge and Data Engineering (TKDE). This arXiv version includes the appendices with additional proofs

  11. arXiv:2005.14431  [pdf, other]

    cs.SI

    Fairness-Aware PageRank

    Authors: Sotiris Tsioutsiouliklis, Evaggelia Pitoura, Panayiotis Tsaparas, Ilias Kleftakis, Nikos Mamoulis

    Abstract: Algorithmic fairness has attracted significant attention in the past years. Surprisingly, there is little work on fairness in networks. In this work, we consider fairness for link analysis algorithms and in particular for the celebrated PageRank algorithm. We provide definitions for fairness, and propose two approaches for achieving fairness. The first modifies the jump vector of the PageRank algo…

    Submitted 23 March, 2021; v1 submitted 29 May, 2020; originally announced May 2020.

    ACM Class: H.3.1; H.3.3; H.3.5; I.6.4; F.2.1; F.3.1

  12. arXiv:2005.08600  [pdf, other]

    cs.DB

    A Two-level Spatial In-Memory Index

    Authors: Dimitrios Tsitsigkos, Konstantinos Lampropoulos, Panagiotis Bouros, Nikos Mamoulis, Manolis Terrovitis

    Abstract: Very large volumes of spatial data increasingly become available and demand effective management. While there have been decades of research on spatial data management, few works consider the current state of commodity hardware, having relatively large memory and the ability of parallel multi-core processing. In this work, we re-consider the design of spatial indexing under this new reality. Specifi…

    Submitted 23 February, 2021; v1 submitted 18 May, 2020; originally announced May 2020.

    Journal ref: Preliminary version of the "A Two-layer Partitioning for Non-point Spatial Data" paper at the 37th IEEE International Conference on Data Engineering (ICDE), Chania, Crete, Greece, April 19-22, 2021

  13. arXiv:2003.01974  [pdf, other]

    cs.DB

    Flow Computation in Temporal Interaction Networks

    Authors: Chrysanthi Kosyfaki, Nikos Mamoulis, Evaggelia Pitoura, Panayiotis Tsaparas

    Abstract: Temporal interaction networks capture the history of activities between entities along a timeline. At each interaction, some quantity of data (money, information, kbytes, etc.) flows from one vertex of the network to another. Flow-based analysis can reveal important information. For instance, financial intelligence units (FIUs) are interested in finding subgraphs in transaction networks with signi…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: 13 pages, 27 figures

  14. arXiv:1911.01042  [pdf, other]

    cs.DB cs.LG

    A General Early-Stopping Module for Crowdsourced Ranking

    Authors: Caihua Shan, Leong Hou U, Nikos Mamoulis, Reynold Cheng, Xiang Li

    Abstract: Crowdsourcing can be used to determine a total order for an object set (e.g., the top-10 NBA players) based on crowd opinions. This ranking problem is often decomposed into a set of microtasks (e.g., pairwise comparisons). These microtasks are passed to a large number of workers and their answers are aggregated to infer the ranking. The number of microtasks depends on the budget allocated for the…

    Submitted 4 November, 2019; originally announced November 2019.

  15. arXiv:1911.01030  [pdf, other]

    cs.LG cs.DB stat.ML

    An End-to-End Deep RL Framework for Task Arrangement in Crowdsourcing Platforms

    Authors: Caihua Shan, Nikos Mamoulis, Reynold Cheng, Guoliang Li, Xiang Li, Yuqiu Qian

    Abstract: In this paper, we propose a Deep Reinforcement Learning (RL) framework for task arrangement, which is a critical problem for the success of crowdsourcing platforms. Previous works conduct the personalized recommendation of tasks to workers via supervised learning methods. However, the majority of them only consider the benefit of either workers or requesters independently. In addition, they cannot…

    Submitted 3 November, 2019; originally announced November 2019.

  16. Parallel In-Memory Evaluation of Spatial Joins

    Authors: Dimitrios Tsitsigkos, Panagiotis Bouros, Nikos Mamoulis, Manolis Terrovitis

    Abstract: The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. As main memories become bigger and faster and commodity hardware supports parallel processing, there is a need to revamp classic join algorithms which have been designed for I/O-bound processing. In view of this, we study the in-memory and parallel evaluation of spatial joins, by re-de…

    Submitted 9 October, 2019; v1 submitted 30 August, 2019; originally announced August 2019.

    Comments: Extended version of the SIGSPATIAL'19 paper under the same title

    Journal ref: 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2019), Chicago, Illinois, USA, November 5-8, 2019

  17. arXiv:1810.08408  [pdf, other]

    cs.SI

    Flow Motifs in Interaction Networks

    Authors: Chrysanthi Kosyfaki, Nikos Mamoulis, Evaggelia Pitoura, Panayiotis Tsaparas

    Abstract: Many real-world phenomena are best represented as interaction networks with dynamic structures (e.g., transaction networks, social networks, traffic networks). Interaction networks capture flow of data which is transferred between their vertices along a timeline. Analyzing such networks is crucial toward comprehending processes in them. A typical analysis task is the finding of motifs, which are…

    Submitted 19 October, 2018; originally announced October 2018.

  18. arXiv:1708.02125  [pdf, other]

    cs.DB

    T-Crowd: Effective Crowdsourcing for Tabular Data

    Authors: Caihua Shan, Nikos Mamoulis, Guoliang Li, Reynold Cheng, Zhipeng Huang, Yudian Zheng

    Abstract: Crowdsourcing employs human workers to solve computer-hard problems, such as data cleaning, entity resolution, and sentiment analysis. When crowdsourcing tabular data, e.g., the attribute values of an entity set, a worker's answers on the different attributes (e.g., the nationality and age of a celebrity star) are often treated independently. This assumption is not always true and can lead to subo…

    Submitted 7 August, 2017; originally announced August 2017.

  19. arXiv:1701.05291  [pdf, other]

    cs.AI

    Heterogeneous Information Network Embedding for Meta Path based Proximity

    Authors: Zhipeng Huang, Nikos Mamoulis

    Abstract: A network embedding is a representation of a large graph in a low-dimensional space, where vertices are modeled as vectors. The objective of a good embedding is to preserve the proximity between vertices in the original graph. This way, typical search and mining methods can be applied in the embedded space with the help of off-the-shelf multidimensional indexing approaches. Existing network embedd…

    Submitted 18 January, 2017; originally announced January 2017.

  20. Set Containment Join Revisited

    Authors: Panagiotis Bouros, Nikos Mamoulis, Shen Ge, Manolis Terrovitis

    Abstract: Given two collections of set objects $R$ and $S$, the $R \bowtie_{\subseteq} S$ set containment join returns all object pairs $(r, s) \in R \times S$ such that $r \subseteq s$. Besides being a basic operator in all modern data management systems with a wide range of applications, the join can be used to evaluate complex SQL queries based on relational division and as a module of data mining algori…

    Submitted 17 March, 2016; originally announced March 2016.

    Comments: To appear at the Knowledge and Information Systems journal (KAIS)

  21. arXiv:1505.02728  [pdf, ps, other]

    cs.DB

    Adaptive Partitioning for Very Large RDF Data

    Authors: Razen Harbi, Ibrahim Abdelaziz, Panos Kalnis, Nikos Mamoulis, Yasser Ebrahim, Majed Sahli

    Abstract: Distributed RDF systems partition data across multiple computer nodes (workers). Some systems perform cheap hash partitioning, which may result in expensive query evaluation, while others apply heuristics aiming at minimizing inter-node communication during query evaluation. This requires an expensive data preprocessing phase, leading to high startup costs for very large RDF knowledge bases. Aprio…

    Submitted 11 May, 2015; originally announced May 2015.

    Comments: 25 pages

  22. arXiv:1305.3407  [pdf, other]

    cs.DB

    Probabilistic Nearest Neighbor Queries on Uncertain Moving Object Trajectories

    Authors: Johannes Niedermayer, Andreas Züfle, Tobias Emrich, Matthias Renz, Nikos Mamoulis, Lei Chen, Hans-Peter Kriegel

    Abstract: Nearest neighbor (NN) queries in trajectory databases have received significant attention in the past, due to their application in spatio-temporal data analysis. Recent work has considered the realistic case where the trajectories are uncertain; however, only simple uncertainty models have been proposed, which do not allow for accurate probabilistic search. In this paper, we fill this gap by addre…

    Submitted 20 January, 2014; v1 submitted 15 May, 2013; originally announced May 2013.

    Comments: 12 pages

    Journal ref: PVLDB 7(3): 205-216 (2013)

  23. arXiv:1207.0135  [pdf, other]

    cs.DB

    Privacy Preservation by Disassociation

    Authors: Manolis Terrovitis, John Liagouris, Nikos Mamoulis, Spiros Skiadopoulos

    Abstract: In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques (a) protect the privacy of users either by altering the set of quasi-identifiers of the original data (e.g., by generalization or suppression) or by adding noise (e.g., using differential privacy) and/or (b) assume a clear distinction…

    Submitted 30 June, 2012; originally announced July 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 10, pp. 944-955 (2012)

  24. arXiv:1111.7169  [pdf, other]

    cs.DB

    Size-l Object Summaries for Relational Keyword Search

    Authors: Georgios J. Fakas, Zhi Cai, Nikos Mamoulis

    Abstract: A previously proposed keyword search paradigm produces, as a query result, a ranked list of Object Summaries (OSs). An OS is a tree structure of related tuples that summarizes all data held in a relational database about a particular Data Subject (DS). However, some of these OSs are very large in size and therefore unfriendly to users that initially prefer synoptic information before proceeding to…

    Submitted 30 November, 2011; originally announced November 2011.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 3, pp. 229-240 (2011)

  25. arXiv:1103.0172  [pdf, other]

    cs.DB

    Inverse Queries For Multidimensional Spaces

    Authors: Thomas Bernecker, Tobias Emrich, Hans-Peter Kriegel, Nikos Mamoulis, Matthias Renz, Shiming Zhang, Andreas Züfle

    Abstract: Traditional spatial queries return, for a given query object $q$, all database objects that satisfy a given predicate, such as epsilon range and $k$-nearest neighbors. This paper defines and studies {\em inverse} spatial queries, which, given a subset of database objects $Q$ and a query predicate, return all objects which, if used as query objects with the predicate, contain $Q$ in their result. W…

    Submitted 5 May, 2011; v1 submitted 1 March, 2011; originally announced March 2011.

  26. arXiv:1101.2613  [pdf, other]

    cs.DB

    A Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases

    Authors: Thomas Bernecker, Tobias Emrich, Hans-Peter Kriegel, Nikos Mamoulis, Matthias Renz, Andreas Zuefle

    Abstract: In this paper, we propose a novel, effective and efficient probabilistic pruning criterion for probabilistic similarity queries on uncertain data. Our approach supports a general uncertainty model using continuous probabilistic density functions to describe the (possibly correlated) uncertain attributes of objects. In a nutshell, the problem to be solved is to compute the PDF of the random variabl…

    Submitted 5 May, 2011; v1 submitted 13 January, 2011; originally announced January 2011.

  27. arXiv:0907.2868  [pdf, other]

    cs.DB cs.IR

    Scalable Probabilistic Similarity Ranking in Uncertain Databases (Technical Report)

    Authors: Thomas Bernecker, Hans-Peter Kriegel, Nikos Mamoulis, Matthias Renz, Andreas Zuefle

    Abstract: This paper introduces a scalable approach for probabilistic top-k similarity ranking on uncertain vector data. Each uncertain object is represented by a set of vector instances that are assumed to be mutually-exclusive. The objective is to rank the uncertain data according to their distance to a reference object. We propose a framework that incrementally computes for each object instance and ran…

    Submitted 16 July, 2009; originally announced July 2009.