-
Secrecy: Secure collaborative analytics on secret-shared data
Authors:
John Liagouris,
Vasiliki Kalavri,
Muhammad Faisal,
Mayank Varia
Abstract:
We present a relational MPC framework for secure collaborative analytics on private data with no information leakage. Our work targets challenging use cases where data owners may not have private resources to participate in the computation, thus, they need to securely outsource the data analysis to untrusted third parties. We define a set of oblivious operators, explain the secure primitives they rely on, and analyze their costs in terms of operations and inter-party communication. We show how these operators can be composed to form end-to-end oblivious queries, and we introduce logical and physical optimizations that dramatically reduce the space and communication requirements during query execution, in some cases from quadratic to linear or from linear to logarithmic with respect to the cardinality of the input.
We implement our framework on top of replicated secret sharing in a system called Secrecy and evaluate it using real queries from several MPC application areas. Our experiments demonstrate that the proposed optimizations can result in over 1000x lower execution times compared to baseline approaches, enabling Secrecy to outperform state-of-the-art frameworks and compute MPC queries on millions of input rows with a single thread per party.
Submitted 3 February, 2022; v1 submitted 1 February, 2021;
originally announced February 2021.
-
Megaphone: Latency-conscious state migration for distributed streaming dataflows
Authors:
Moritz Hoffmann,
Andrea Lattuada,
Frank McSherry,
Vasiliki Kalavri,
John Liagouris,
Timothy Roscoe
Abstract:
We design and implement Megaphone, a data migration mechanism for stateful distributed dataflow engines with latency objectives. When compared to existing migration mechanisms, Megaphone has the following differentiating characteristics: (i) migrations can be subdivided to a configurable granularity to avoid latency spikes, and (ii) migrations can be prepared ahead of time to avoid runtime coordination. Megaphone is implemented as a library on an unmodified timely dataflow implementation, and provides an operator interface compatible with its existing APIs. We evaluate Megaphone on established benchmarks with varying amounts of state and observe that compared to naïve approaches Megaphone reduces service latencies during reconfiguration by orders of magnitude without significantly increasing steady-state overhead.
Submitted 16 April, 2019; v1 submitted 4 December, 2018;
originally announced December 2018.
-
DeltaPath: dataflow-based high-performance incremental routing
Authors:
Desislava Dimitrova,
John Liagouris,
Sebastian Wicki,
Moritz Hoffmann,
Vasiliki Kalavri,
Timothy Roscoe
Abstract:
Routing controllers must react quickly to failures, reconfigurations and workload or policy changes, to ensure service performance and cost-efficient network operation. We propose a general execution model which views routing as an incremental data-parallel computation on a graph-based network model plus a continuous stream of network changes. Our approach supports different routing objectives with only minor re-configuration of its core algorithm, and easily accommodates dynamic user-defined routing policies. Moreover, our prototype demonstrates excellent performance: on Google Jupiter topology it reacts with a median time of 350ms to link failures and serves more than two million path requests per second each with latency under 1ms. This is three orders of magnitude faster than the popular ONOS open-source SDN controller.
Submitted 21 August, 2018;
originally announced August 2018.
-
graphVizdb: A Scalable Platform for Interactive Large Graph Visualization
Authors:
Nikos Bikakis,
John Liagouris,
Maria Krommyda,
George Papastefanatos,
Timos Sellis
Abstract:
We present a novel platform for the interactive visualization of very large graphs. The platform enables the user to interact with the visualized graph in a way that is very similar to the exploration of maps at multiple levels. Our approach involves an offline preprocessing phase that builds the layout of the graph by assigning coordinates to its nodes with respect to a Euclidean plane. The respective points are indexed with a spatial data structure, i.e., an R-tree, and stored in a database. Multiple abstraction layers of the graph based on various criteria are also created offline, and they are indexed similarly so that the user can explore the dataset at different levels of granularity, depending on her particular needs. Then, our system translates user operations into simple and very efficient spatial operations (i.e., window queries) in the backend. This technique allows for a fine-grained access to very large graphs with extremely low latency and memory requirements and without compromising the functionality of the tool. Our web-based prototype supports three main operations: (1) interactive navigation, (2) multi-level exploration, and (3) keyword search on the graph metadata.
Submitted 20 February, 2016;
originally announced February 2016.
-
Towards Scalable Visual Exploration of Very Large RDF Graphs
Authors:
Nikos Bikakis,
John Liagouris,
Maria Krommyda,
George Papastefanatos,
Timos Sellis
Abstract:
In this paper, we outline our work on developing a disk-based infrastructure for efficient visualization and graph exploration operations over very large graphs. The proposed platform, called graphVizdb, is based on a novel technique for indexing and storing the graph. Particularly, the graph layout is indexed with a spatial data structure, i.e., an R-tree, and stored in a database. In runtime, user operations are translated into efficient spatial operations (i.e., window queries) in the backend.
Submitted 16 June, 2015; v1 submitted 13 June, 2015;
originally announced June 2015.
-
Privacy Preservation by Disassociation
Authors:
Manolis Terrovitis,
John Liagouris,
Nikos Mamoulis,
Spiros Skiadopoulos
Abstract:
In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques (a) protect the privacy of users either by altering the set of quasi-identifiers of the original data (e.g., by generalization or suppression) or by adding noise (e.g., using differential privacy) and/or (b) assume a clear distinction between sensitive and non-sensitive information and sever the possible linkage. In many real world applications the above techniques are not applicable. For instance, consider web search query logs. Suppressing or generalizing anonymization methods would remove the most valuable information in the dataset: the original query terms. Additionally, web search query logs contain millions of query terms which cannot be categorized as sensitive or non-sensitive since a term may be sensitive for a user and non-sensitive for another. Motivated by this observation, we propose an anonymization technique termed disassociation that preserves the original terms but hides the fact that two or more different terms appear in the same record. We protect the users' privacy by disassociating record terms that participate in identifying combinations. This way the adversary cannot associate with high probability a record with a rare combination of terms. To the best of our knowledge, our proposal is the first to employ such a technique to provide protection against identity disclosure. We propose an anonymization algorithm based on our approach and evaluate its performance on real and synthetic datasets, comparing it against other state-of-the-art methods based on generalization and differential privacy.
Submitted 30 June, 2012;
originally announced July 2012.