-
Collective Privacy Recovery: Data-sharing Coordination via Decentralized Artificial Intelligence
Authors:
Evangelos Pournaras,
Mark Christopher Ballandies,
Stefano Bennati,
Chien-fei Chen
Abstract:
Collective privacy loss is becoming a colossal problem, an emergency for personal freedoms and democracy. But are we prepared to handle personal data as a scarce resource and collectively share data under the doctrine: as little as possible, as much as necessary? We hypothesize a significant privacy recovery if a population of individuals, the data collective, coordinates to share minimum data for running online services with the required quality. Here we show how to automate and scale up complex collective arrangements for privacy recovery using decentralized artificial intelligence. For this, we compare for the first time attitudinal, intrinsic, rewarded and coordinated data sharing in a rigorous living-lab experiment of high realism involving >27,000 real data disclosures. Using causal inference and cluster analysis, we differentiate criteria predicting privacy and five key data-sharing behaviors. Strikingly, data-sharing coordination proves to be a win-win for all: remarkable privacy recovery for people with evident cost reduction for service providers.
Submitted 10 July, 2023; v1 submitted 14 January, 2023;
originally announced January 2023.
-
Modelling imperfect knowledge via location semantics for realistic privacy risks estimation in trajectory data
Authors:
Stefano Bennati,
Aleksandra Kovacevic
Abstract:
Mobility patterns of vehicles and people provide powerful data sources for location-based services such as fleet optimization and traffic flow analysis. Location-based service providers must balance the value they extract from trajectory data with protecting the privacy of the individuals behind those trajectories. Reaching this goal requires accurately measuring the values of utility and privacy. Current measurement approaches assume adversaries with perfect knowledge and thus overestimate the privacy risk. To address this issue, we introduce a model of an adversary with imperfect knowledge about the target. The model is based on equivalence areas, spatio-temporal regions with a semantic meaning, e.g. the target's home, whose size and accuracy determine the skill of the adversary. We then derive the standard privacy metrics of k-anonymity, l-diversity and t-closeness from the definition of equivalence areas. These metrics can be computed on any dataset, irrespective of whether and what kind of anonymization has been applied to it. This work is of high relevance to all service providers acting as processors of trajectory data who want to manage privacy risks and optimize the privacy vs. utility trade-off of their services.
Submitted 7 December, 2021; v1 submitted 18 November, 2020;
originally announced November 2020.
-
Volunteers in the Smart City: Comparison of Contribution Strategies on Human-Centered Measures
Authors:
Stefano Bennati,
Ivana Dusparic,
Rhythima Shinde,
Catholijn M. Jonker
Abstract:
Several smart city services rely on user contributions, e.g., data, which can be costly for the users in terms of privacy. High costs lead to reduced user participation, which undermines the success of smart city technologies. This work develops a scenario-independent design principle, based on public good theory, for resource management in smart city applications, where provision of a service depends on contributors and free-riders, who benefit from the service without contributing their own resources. Following this design principle, different classes of algorithms for resource management are evaluated with respect to human-centered measures, i.e., privacy, fairness and social welfare. Trade-offs that characterize algorithms are discussed across two smart city application scenarios. These results might help smart city application designers to choose a suitable algorithm given a scenario-specific set of requirements, and users to choose a service based on an algorithm that matches their preferences.
Submitted 23 May, 2018;
originally announced May 2018.
-
PriMaL: A Privacy-Preserving Machine Learning Method for Event Detection in Distributed Sensor Networks
Authors:
Stefano Bennati,
Catholijn M. Jonker
Abstract:
This paper introduces PriMaL, a general PRIvacy-preserving MAchine-Learning method for reducing the privacy cost of information transmitted through a network. Distributed sensor networks are often used for automated classification and detection of abnormal events in high-stakes situations, e.g. fire in buildings, earthquakes, or crowd disasters. Such networks might transmit privacy-sensitive information, e.g. GPS location of smartphones, which might be disclosed if the network is compromised. Privacy concerns might slow down the adoption of the technology, in particular in the scenario of social sensing where participation is voluntary; thus, solutions are needed that improve privacy without compromising on the event detection accuracy. PriMaL is implemented as a machine-learning layer that works on top of an existing event detection algorithm. Experiments are run in a general simulation framework, for several network topologies and parameter values. The privacy footprint of state-of-the-art event detection algorithms is compared within the proposed framework. Results show that PriMaL is able to reduce the privacy cost of a distributed event detection algorithm below that of the corresponding centralized algorithm, within the bounds of some assumptions about the protocol. Moreover, the performance of the distributed algorithm is not statistically worse than that of the centralized algorithm.
Submitted 21 March, 2017;
originally announced March 2017.
-
Privacy-enhancing Aggregation of Internet of Things Data via Sensors Grouping
Authors:
Stefano Bennati,
Evangelos Pournaras
Abstract:
Big data collection practices using Internet of Things (IoT) pervasive technologies are often privacy-intrusive and result in surveillance, profiling, and discriminatory actions over citizens that in turn undermine the participation of citizens in the development of sustainable smart cities. Nevertheless, real-time data analytics and aggregate information from IoT devices open up tremendous opportunities for managing smart city infrastructures. The privacy-enhancing aggregation of distributed sensor data, such as residential energy consumption or traffic information, is the research focus of this paper. Citizens have the option to choose their privacy level by reducing the quality of the shared data at the cost of lower accuracy in data analytics services. A baseline scenario is considered in which IoT sensor data are shared directly with an untrustworthy central aggregator. A grouping mechanism is introduced that improves privacy by sharing data aggregated first at a group level, as opposed to sharing data directly with the central aggregator. Group-level aggregation obfuscates sensor data of individuals, in a similar fashion as differential privacy and homomorphic encryption schemes, thus inference of privacy-sensitive information from single sensors becomes computationally harder compared to the baseline scenario. The proposed system is evaluated using real-world data from two smart city pilot projects. Privacy under grouping increases, while preserving the accuracy of the baseline scenario. Intra-group influences of privacy by one group member on the other ones are measured and fairness on privacy is found to be maximized between group members with similar privacy choices. Several grouping strategies are compared. Grouping by proximity of privacy choices provides the highest privacy gains. The implications of the strategy on the design of incentive mechanisms are discussed.
Submitted 1 March, 2018; v1 submitted 28 February, 2017;
originally announced February 2017.
-
On the Role of Collective Sensing and Evolution in Group Formation
Authors:
Stefano Bennati
Abstract:
Collective sensing is an emergent phenomenon which enables individuals to estimate a hidden property of the environment through the observation of social interactions. Previous work on collective sensing shows that gregarious individuals obtain an evolutionary advantage by exploiting collective sensing when competing against solitary individuals. This work addresses the question of whether collective sensing allows for the emergence of groups from a population of individuals without predetermined behaviors. It is assumed that group membership does not lessen competition on the limited resources in the environment, e.g. groups do not improve foraging efficiency. Experiments are run in an agent-based evolutionary model of a foraging task, where the fitness of the agents depends on their foraging strategy. The foraging strategy of agents is determined by a neural network, which does not require explicit modeling of the environment and of the interactions between agents. Experiments demonstrate that gregarious behavior is not the evolutionarily fittest strategy if resources are abundant, thus invalidating previous findings in a specific region of the parameter space. In other words, resource scarcity makes gregarious behavior so valuable as to make up for the increased competition over the few available resources. Furthermore, it is shown that a population of solitary agents can evolve gregarious behavior in response to a sudden scarcity of resources, thus identifying a possible mechanism that leads to gregarious behavior in nature. The evolutionary process operates on the whole parameter space of the neural networks, hence these behaviors are selected among an unconstrained set of behavioral models.
Submitted 15 February, 2018; v1 submitted 22 February, 2016;
originally announced February 2016.
-
Malware Task Identification: A Data Driven Approach
Authors:
Eric Nunes,
Casey Buto,
Paulo Shakarian,
Christian Lebiere,
Stefano Bennati,
Robert Thomson,
Holger Jaenisch
Abstract:
Identifying the tasks a given piece of malware was designed to perform (e.g. logging keystrokes, recording video, establishing remote access, etc.) is a difficult and time-consuming operation that is largely human-driven in practice. In this paper, we present an automated method to identify malware tasks. Using two different malware collections, we explore various circumstances for each, including cases where the training data differs significantly from the test data; where the malware being evaluated employs packing to thwart analytical techniques; and conditions with sparse training data. We find that this approach consistently outperforms the current state-of-the-art software for malware task identification as well as standard machine learning approaches, often achieving an unbiased F1 score of over 0.9. In the near future, we look to deploy our approach for use by analysts in an operational cyber-security environment.
Submitted 7 July, 2015;
originally announced July 2015.