Search | arXiv e-print repository

Metric entropy of causal, discrete-time LTI systems

Authors: Clemens Hutter, Thomas Allard, Helmut Bölcskei

Abstract: In [1] it is shown that recurrent neural networks (RNNs) can learn - in a metric entropy optimal manner - discrete time, linear time-invariant (LTI) systems. This is effected by comparing the number of bits needed to encode the approximating RNN to the metric entropy of the class of LTI systems under consideration [2, 3]. The purpose of this note is to provide an elementary self-contained proof of… ▽ More In [1] it is shown that recurrent neural networks (RNNs) can learn - in a metric entropy optimal manner - discrete time, linear time-invariant (LTI) systems. This is effected by comparing the number of bits needed to encode the approximating RNN to the metric entropy of the class of LTI systems under consideration [2, 3]. The purpose of this note is to provide an elementary self-contained proof of the metric entropy results in [2, 3], in the process of which minor mathematical issues appearing in [2, 3] are cleaned up. These corrections also lead to the correction of a constant in a result in [1] (see Remark 2.5). △ Less

Submitted 28 November, 2022; originally announced November 2022.

Comments: [1] arXiv:2105.02556

arXiv:2211.07205 [pdf, other]

Unique in the Smart Grid -The Privacy Cost of Fine-Grained Electrical Consumption Data

Authors: Antonin Voyez, Tristan Allard, Gildas Avoine, Pierre Cauchois, Elisa Fromont, Matthieu Simonin

Abstract: The collection of electrical consumption time series through smart meters grows with ambitious nationwide smart grid programs. This data is both highly sensitive and highly valuable: strong laws about personal data protect it while laws about open data aim at making it public after a privacy-preserving data publishing process. In this work, we study the uniqueness of large scale real-life fine-gra… ▽ More The collection of electrical consumption time series through smart meters grows with ambitious nationwide smart grid programs. This data is both highly sensitive and highly valuable: strong laws about personal data protect it while laws about open data aim at making it public after a privacy-preserving data publishing process. In this work, we study the uniqueness of large scale real-life fine-grained electrical consumption time-series and show its link to privacy threats. Our results show a worryingly high uniqueness rate in such datasets. In particular, we show that knowing 5 consecutive electric measures allows to re-identify on average more than 90% of households in our 2.5M half-hourly electric time series dataset. Moreover, uniqueness remains high even when data is severely degraded. For example, when data is rounded to the nearest 100 watts, knowing 7 consecutive electric measures allows to re-identify on average more than 40% of the households (same dataset). We also study the relationship between uniqueness and entropy, uniqueness and electric consumption, and electric consumption and temperatures, showing their strong correlation. △ Less

Submitted 14 November, 2022; originally announced November 2022.

arXiv:2104.09175 [pdf, other]

doi 10.1145/3442442.3458610

BrFAST: a Tool to Select Browser Fingerprinting Attributes for Web Authentication According to a Usability-Security Trade-off

Authors: Nampoina Andriamilanto, Tristan Allard

Abstract: In this demonstration, we put ourselves in the place of a website manager who seeks to use browser fingerprinting for web authentication. The first step is to choose the attributes to implement among the hundreds that are available. To do so, we developed BrFAST, an attribute selection platform that includes FPSelect, an algorithm that rigorously selects the attributes according to a trade-off bet… ▽ More In this demonstration, we put ourselves in the place of a website manager who seeks to use browser fingerprinting for web authentication. The first step is to choose the attributes to implement among the hundreds that are available. To do so, we developed BrFAST, an attribute selection platform that includes FPSelect, an algorithm that rigorously selects the attributes according to a trade-off between security and usability. BrFAST is configured with a set of parameters for which we provide values for BrFAST to be usable as is. We notably include the resources to use two publicly available browser fingerprint datasets. BrFAST can be extended to use other parameters: other attribute selection methods, other measures of security and usability, or other fingerprint datasets. BrFAST helps visualize the exploration of the possibilities during the search of the best attributes to use, compare the properties of attribute sets, and compare several attribute selection methods. During the demonstration, we compare the attributes selected by FPSelect with these selected by the usual methods according to the properties of the resulting browser fingerprints (e.g., their usability, their unicity). △ Less

Submitted 19 April, 2021; originally announced April 2021.

arXiv:2010.06404 [pdf, other]

doi 10.1145/3427228.3427297

FPSelect: Low-Cost Browser Fingerprints for Mitigating Dictionary Attacks against Web Authentication Mechanisms

Authors: Nampoina Andriamilanto, Tristan Allard, Gaëtan Le Guelvouit

Abstract: Browser fingerprinting consists into collecting attributes from a web browser. Hundreds of attributes have been discovered through the years. Each one of them provides a way to distinguish browsers, but also comes with a usability cost (e.g., additional collection time). In this work, we propose FPSelect, an attribute selection framework allowing verifiers to tune their browser fingerprinting prob… ▽ More Browser fingerprinting consists into collecting attributes from a web browser. Hundreds of attributes have been discovered through the years. Each one of them provides a way to distinguish browsers, but also comes with a usability cost (e.g., additional collection time). In this work, we propose FPSelect, an attribute selection framework allowing verifiers to tune their browser fingerprinting probes for web authentication. We formalize the problem as searching for the attribute set that satisfies a security requirement and minimizes the usability cost. The security is measured as the proportion of impersonated users given a fingerprinting probe, a user population, and an attacker that knows the exact fingerprint distribution among the user population. The usability is quantified by the collection time of browser fingerprints, their size, and their instability. We compare our framework with common baselines, based on a real-life fingerprint dataset, and find out that in our experimental settings, our framework selects attribute sets of lower usability cost. Compared to the baselines, the attribute sets found by FPSelect generate fingerprints that are up to 97 times smaller, are collected up to 3,361 times faster, and with up to 7.2 times less changing attributes between two observations, on average. △ Less

Submitted 13 October, 2020; originally announced October 2020.

arXiv:2007.05373 [pdf, other]

From Task Tuning to Task Assignment in Privacy-Preserving Crowdsourcing Platforms

Authors: Joris Duguépéroux, Tristan Allard

Abstract: Specialized worker profiles of crowdsourcing platforms may contain a large amount of identifying and possibly sensitive personal information (e.g., personal preferences, skills, available slots, available devices) raising strong privacy concerns. This led to the design of privacy-preserving crowdsourcing platforms, that aim at enabling efficient crowd-sourcing processes while providing strong priv… ▽ More Specialized worker profiles of crowdsourcing platforms may contain a large amount of identifying and possibly sensitive personal information (e.g., personal preferences, skills, available slots, available devices) raising strong privacy concerns. This led to the design of privacy-preserving crowdsourcing platforms, that aim at enabling efficient crowd-sourcing processes while providing strong privacy guarantees even when the platform is not fully trusted. In this paper, we propose two contributions. First, we propose the PKD algorithm with the goal of supporting a large variety of aggregate usages of worker profiles within a privacy-preserving crowdsourcing platform. The PKD algorithm combines together homomorphic encryption and differential privacy for computing (perturbed) partitions of the multi-dimensional space of skills of the actual population of workers and a (perturbed) COUNT of workers per partition. Second, we propose to benefit from recent progresses in Private Information Retrieval techniques in order to design a solution to task assignment that is both private and affordable. We perform an in-depth study of the problem of using PIR techniques for proposing tasks to workers, show that it is NP-Hard, and come up with the PKD PIR Packing heuristic that groups tasks together according to the partitioning output by the PKD algorithm. In a nutshell, we design the PKD algorithm and the PKD PIR Packing heuristic, we prove formally their security against honest-but-curious workers and/or platform, we analyze their complexities, and we demonstrate their quality and affordability in real-life scenarios through an extensive experimental evaluation performed over both synthetic and realistic datasets. △ Less

Submitted 10 July, 2020; originally announced July 2020.

arXiv:2007.01688 [pdf, other]

Online publication of court records: circumventing the privacy-transparency trade-off

Authors: Tristan Allard, Louis Béziaud, Sébastien Gambs

Abstract: The open data movement is leading to the massive publishing of court records online, increasing transparency and accessibility of justice, and to the design of legal technologies building on the wealth of legal data available. However, the sensitive nature of legal decisions also raises important privacy issues. Current practices solve the resulting privacy versus transparency trade-off by combini… ▽ More The open data movement is leading to the massive publishing of court records online, increasing transparency and accessibility of justice, and to the design of legal technologies building on the wealth of legal data available. However, the sensitive nature of legal decisions also raises important privacy issues. Current practices solve the resulting privacy versus transparency trade-off by combining access control with (manual or semi-manual) text redaction. In this work, we claim that current practices are insufficient for co** with massive access to legal data (restrictive access control policies is detrimental to openness and to utility while text redaction is unable to provide sound privacy protection) and advocate for a in-tegrative approach that could benefit from the latest developments of the privacy-preserving data publishing domain. We present a thorough analysis of the problem and of the current approaches, and propose a straw man multimodal architecture paving the way to a full-fledged privacy-preserving legal data publishing system. △ Less

Submitted 3 July, 2020; originally announced July 2020.

arXiv:2006.09511 [pdf, other]

doi 10.1145/3478026

A Large-scale Empirical Analysis of Browser Fingerprints Properties for Web Authentication

Authors: Nampoina Andriamilanto, Tristan Allard, Gaëtan Le Guelvouit, Alexandre Garel

Abstract: Modern browsers give access to several attributes that can be collected to form a browser fingerprint. Although browser fingerprints have primarily been studied as a web tracking tool, they can contribute to improve the current state of web security by augmenting web authentication mechanisms. In this paper, we investigate the adequacy of browser fingerprints for web authentication. We make the li… ▽ More Modern browsers give access to several attributes that can be collected to form a browser fingerprint. Although browser fingerprints have primarily been studied as a web tracking tool, they can contribute to improve the current state of web security by augmenting web authentication mechanisms. In this paper, we investigate the adequacy of browser fingerprints for web authentication. We make the link between the digital fingerprints that distinguish browsers, and the biological fingerprints that distinguish Humans, to evaluate browser fingerprints according to properties inspired by biometric authentication factors. These properties include their distinctiveness, their stability through time, their collection time, their size, and the accuracy of a simple verification mechanism. We assess these properties on a large-scale dataset of 4,145,408 fingerprints composed of 216 attributes, and collected from 1,989,365 browsers. We show that, by time-partitioning our dataset, more than 81.3% of our fingerprints are shared by a single browser. Although browser fingerprints are known to evolve, an average of 91% of the attributes of our fingerprints stay identical between two observations, even when separated by nearly 6 months. About their performance, we show that our fingerprints weigh a dozen of kilobytes, and take a few seconds to collect. Finally, by processing a simple verification mechanism, we show that it achieves an equal error rate of 0.61%. We enrich our results with the analysis of the correlation between the attributes, and of their contribution to the evaluated properties. We conclude that our browser fingerprints carry the promise to strengthen web authentication mechanisms. △ Less

Submitted 3 October, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Journal ref: ACM Transactions on the Web, Volume 16, Issue 1, February 2022, Article 4, pp 1 62

arXiv:2005.09353 [pdf, other]

doi 10.1007/978-3-030-50399-4_16

"Guess Who ?" Large-Scale Data-Centric Study of the Adequacy of Browser Fingerprints for Web Authentication

Authors: Nampoina Andriamilanto, Tristan Allard, Gaëtan Le Guelvouit

Abstract: Browser fingerprinting consists in collecting attributes from a web browser to build a browser fingerprint. In this work, we assess the adequacy of browser fingerprints as an authentication factor, on a dataset of 4,145,408 fingerprints composed of 216 attributes. It was collected throughout 6 months from a population of general browsers. We identify, formalize, and assess the properties for brows… ▽ More Browser fingerprinting consists in collecting attributes from a web browser to build a browser fingerprint. In this work, we assess the adequacy of browser fingerprints as an authentication factor, on a dataset of 4,145,408 fingerprints composed of 216 attributes. It was collected throughout 6 months from a population of general browsers. We identify, formalize, and assess the properties for browser fingerprints to be usable and practical as an authentication factor. We notably evaluate their distinctiveness, their stability through time, their collection time, and their size in memory. We show that considering a large surface of 216 fingerprinting attributes leads to an unicity rate of 81% on a population of 1,989,365 browsers. Moreover, browser fingerprints are known to evolve, but we observe that between consecutive fingerprints, more than 90% of the attributes remain unchanged after nearly 6 months. Fingerprints are also affordable. On average, they weigh a dozen of kilobytes, and are collected in a few seconds. We conclude that browser fingerprints are a promising additional web authentication factor. △ Less

Submitted 22 June, 2021; v1 submitted 19 May, 2020; originally announced May 2020.

arXiv:2005.01038 [pdf, other]

SEPAR: Towards Regulating Future of Work Multi-Platform Crowdworking Environments with Privacy Guarantees

Authors: Mohammad Javad Amiri, Joris Duguépéroux, Tristan Allard, Divyakant Agrawal, Amr El Abbadi

Abstract: Crowdworking platforms provide the opportunity for diverse workers to execute tasks for different requesters. The popularity of the "gig" economy has given rise to independent platforms that provide competing and complementary services. Workers as well as requesters with specific tasks may need to work for or avail from the services of multiple platforms resulting in the rise of multi-platform cro… ▽ More Crowdworking platforms provide the opportunity for diverse workers to execute tasks for different requesters. The popularity of the "gig" economy has given rise to independent platforms that provide competing and complementary services. Workers as well as requesters with specific tasks may need to work for or avail from the services of multiple platforms resulting in the rise of multi-platform crowdworking systems. Recently, there has been increasing interest by governmental, legal and social institutions to enforce regulations, such as minimal and maximal work hours, on crowdworking platforms. Platforms within multi-platform crowdworking systems, therefore, need to collaborate to enforce cross-platform regulations. While collaborating to enforce global regulations requires the transparent sharing of information about tasks and their participants, the privacy of all participants needs to be preserved. In this paper, we propose an overall vision exploring the regulation, privacy, and architecture dimensions for the future of work multi-platform crowdworking environments. We then present SEPAR, a multi-platform crowdworking system that enforces a large sub-space of practical global regulations on a set of distributed independent platforms in a privacy-preserving manner. SEPAR, enforces privacy using lightweight and anonymous tokens, while transparency is achieved using fault-tolerant blockchains shared across multiple platforms. The privacy guarantees of SEPAR against covert adversaries are formalized and thoroughly demonstrated, while the experiments reveal the efficiency of SEPAR in terms of performance and scalability. △ Less

Submitted 21 October, 2020; v1 submitted 3 May, 2020; originally announced May 2020.

Showing 1–9 of 9 results for author: Allard, T