-
De-DSI: Decentralised Differentiable Search Index
Authors:
Petru Neague,
Marcel Gregoriadis,
Johan Pouwelse
Abstract:
This study introduces De-DSI, a novel framework that fuses large language models (LLMs) with genuine decentralization for information retrieval, particularly employing the differentiable search index (DSI) concept in a decentralized setting. Focused on efficiently connecting novel user queries with document identifiers without direct document access, De-DSI operates solely on query-docid pairs. To enhance scalability, an ensemble of DSI models is introduced, where the dataset is partitioned into smaller shards for individual model training. This approach not only maintains accuracy by reducing the number of data each model needs to handle but also facilitates scalability by aggregating outcomes from multiple models. This aggregation uses a beam search to identify top docids and applies a softmax function for score normalization, selecting documents with the highest scores for retrieval. The decentralized implementation demonstrates that retrieval success is comparable to centralized methods, with the added benefit of the possibility of distributing computational complexity across the network. This setup also allows for the retrieval of multimedia items through magnet links, eliminating the need for platforms or intermediaries.
Submitted 19 April, 2024; v1 submitted 18 April, 2024;
originally announced April 2024.
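The aggregation step described in the De-DSI abstract above (per-model beam search, softmax normalization, selection of the highest-scoring docids) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `softmax` and `aggregate`, the input layout (one list of `(docid, raw_score)` beams per shard model), and the choice to keep the maximum normalized score when a docid appears in several shards are all assumptions made here.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one model's beam scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(
    shard_results: List[List[Tuple[str, float]]], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Merge beam-search outputs from several shard models (hypothetical helper).

    Each inner list holds the (docid, raw_score) beams of one model.
    Scores are normalized per model so they are comparable across models,
    then the top_n docids overall are returned.
    """
    best: Dict[str, float] = {}
    for beams in shard_results:
        if not beams:
            continue  # a model that returned nothing contributes nothing
        probs = softmax([score for _, score in beams])
        for (docid, _), prob in zip(beams, probs):
            # Assumption: keep the highest normalized score per docid.
            best[docid] = max(best.get(docid, 0.0), prob)
    # Sort by descending score; break ties by docid for determinism.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    results = [
        [("docA", 2.0), ("docB", 1.0)],  # beams from shard model 1
        [("docC", 0.0), ("docD", 0.0)],  # beams from shard model 2
    ]
    print(aggregate(results))
```

Normalizing per model before merging is what lets scores from independently trained shard models be compared on one scale; other merge rules (for example summing scores) would be equally plausible under the abstract's description.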
-
Failures of public key infrastructure: 53 year survey
Authors:
Adrian-Tudor Dumitrescu,
Johan Pouwelse
Abstract:
The Public Key Infrastructure existed in critical infrastructure systems since the expansion of the World Wide Web, but to this day its limitations have not been completely solved. With the rise of government-driven digital identity in Europe, it is more important than ever to understand how PKI can be an efficient frame for eID and to learn from mistakes encountered by other countries in such critical systems. This survey aims to analyze the literature on the problems and risks that PKI exhibits, establish a brief timeline of its evolution in the last decades and study how it was implemented in digital identity projects.
Submitted 11 January, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
-
Mass Adoption of NATs: Survey and experiments on carrier-grade NATs
Authors:
Orestis Kanaris,
Johan Pouwelse
Abstract:
In recent times, the prevalence of home NATs and the widespread implementation of Carrier-Grade NATs have posed significant challenges to various applications, particularly those relying on Peer-to-Peer communication. This paper addresses these issues by conducting a thorough review of related literature and exploring potential techniques to mitigate the problems. The literature review focuses on the disruptive effects of home NATs and CGNATs on application performance. Additionally, the study examines existing approaches used to alleviate these disruptions. Furthermore, this paper presents a comprehensive guide on how to puncture a NAT and facilitate direct communication between two peers behind any type of NAT. The techniques outlined in the guide are rigorously tested using a simple application running the IPv8 network overlay, along with their built-in NAT penetration procedures. To evaluate the effectiveness of the proposed techniques, 5G communication is established between two phones using four different Dutch telephone carriers. The results indicate successful cross-connectivity with three out of the four carriers tested, showcasing the practical applicability of the suggested methods.
Submitted 15 November, 2023; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Augmenting LLMs with Knowledge: A survey on hallucination prevention
Authors:
Konstantinos Andriopoulos,
Johan Pouwelse
Abstract:
Large pre-trained language models have demonstrated their proficiency in storing factual knowledge within their parameters and achieving remarkable results when fine-tuned for downstream natural language processing tasks. Nonetheless, their capacity to access and manipulate knowledge with precision remains constrained, resulting in performance disparities on knowledge-intensive tasks when compared to task-specific architectures. Additionally, the challenges of providing provenance for model decisions and maintaining up-to-date world knowledge persist as open research frontiers. To address these limitations, the integration of pre-trained models with differentiable access mechanisms to explicit non-parametric memory emerges as a promising solution. This survey delves into the realm of language models (LMs) augmented with the ability to tap into external knowledge sources, including external knowledge bases and search engines. While adhering to the standard objective of predicting missing tokens, these augmented LMs leverage diverse, possibly non-parametric external modules to augment their contextual processing capabilities, departing from the conventional language modeling paradigm. Through an exploration of current advancements in augmenting large language models with knowledge, this work concludes that this emerging research direction holds the potential to address prevalent issues in traditional LMs, such as hallucinations, un-grounded responses, and scalability challenges.
Submitted 28 September, 2023;
originally announced September 2023.
-
Sustainable Cooperation in Peer-To-Peer Networks
Authors:
Bulat Nasrulin,
Rowdy Chotkan,
Johan Pouwelse
Abstract:
Traditionally, peer-to-peer systems have relied on altruism and reciprocity. Although incentive-based models have gained prominence in new-generation peer-to-peer systems, it is essential to recognize the continued importance of cooperative principles in achieving performance, fairness, and correctness. The lack of this acknowledgment has paved the way for selfish peers to gain unfair advantages in these systems. As such, we address the challenge of selfish peers by devising a mechanism to reward sustained cooperation. Instead of relying on global accountability mechanisms, we propose a protocol that naturally aggregates local evaluations of cooperation. Traditional mechanisms are often vulnerable to Sybil and misreporting attacks. However, our approach overcomes these issues by limiting the benefits selfish peers can gain without incurring any cost. The viability of our algorithm is proven with a deployment to 27,259 Internet users and a realistic simulation of a blockchain gossip protocol. We show that our protocol sustains cooperation even in the presence of a majority of selfish peers while incurring only negligible overhead.
Submitted 14 August, 2023;
originally announced August 2023.
-
LØ: An Accountable Mempool for MEV Resistance
Authors:
Bulat Nasrulin,
Georgy Ishmaev,
Jérémie Decouchant,
Johan Pouwelse
Abstract:
Possible manipulation of user transactions by miners in permissionless blockchain systems is a growing concern. This problem, a pervasive and systemic issue known as Miner Extractable Value (MEV), incurs high costs on users of decentralised applications. Furthermore, transaction manipulations create other issues in blockchain systems such as congestion, higher fees, and system instability. Detecting transaction manipulations is difficult, even though it is known that they originate from the pre-consensus phase of transaction selection for block building, at the base layer of blockchain protocols. In this paper we summarize known transaction manipulation attacks. We then present LØ, an accountable base layer protocol specifically designed to detect and mitigate transaction manipulations. LØ is built around accurate detection of transaction manipulations and assignment of blame at the granularity of a single mining node. LØ forces miners to log all the transactions they receive into a secure mempool data structure and to process them in a verifiable manner. Overall, LØ quickly and efficiently detects reordering, injection or censorship attempts. Our performance evaluation shows that LØ is also practical and only introduces a marginal performance overhead.
Submitted 5 July, 2023;
originally announced July 2023.
-
Web3Recommend: Decentralised recommendations with trust and relevance
Authors:
Rohan Madhwal,
Johan Pouwelse
Abstract:
Web3Recommend is a decentralized Social Recommender System implementation that enables Web3 Platforms on Android to generate recommendations that balance trust and relevance. Generating recommendations in decentralized networks is a non-trivial problem because these networks lack a global perspective due to the absence of a central authority. Further, decentralized networks are prone to Sybil Attacks in which a single malicious user can generate multiple fake or Sybil identities. Web3Recommend relies on a novel graph-based content recommendation design inspired by GraphJet, a recommendation system used in Twitter enhanced with MeritRank, a decentralized reputation scheme that provides Sybil-resistance to the system. By adding MeritRank's decay parameters to the vanilla Social Recommender Systems' personalized SALSA graph algorithm, we can provide theoretical guarantees against Sybil Attacks in the generated recommendations. Similar to GraphJet, we focus on generating real-time recommendations by only acting on recent interactions in the social network, allowing us to cater temporally contextual recommendations while keeping a tight bound on the memory usage in resource-constrained devices, allowing for a seamless user experience. As a proof-of-concept, we integrate our system with MusicDAO, an open-source Web3 music-sharing platform, to generate personalized, real-time recommendations. Thus, we provide the first Sybil-resistant Social Recommender System, allowing real-time recommendations beyond classic user-based collaborative filtering. The system is also rigorously tested with extensive unit and integration tests. Further, our experiments demonstrate the trust-relevance balance of recommendations against multiple adversarial strategies in a test network generated using data from real music platforms.
Submitted 3 July, 2023;
originally announced July 2023.
-
Towards Sybil Resilience in Decentralized Learning
Authors:
Thomas Werthenbach,
Johan Pouwelse
Abstract:
Federated learning is a privacy-enforcing machine learning technology but suffers from limited scalability. This limitation mostly originates from the internet connection and memory capacity of the central parameter server, and the complexity of the model aggregation function. Decentralized learning has recently been emerging as a promising alternative to federated learning. This novel technology eliminates the need for a central parameter server by decentralizing the model aggregation across all participating nodes. Numerous studies have been conducted on improving the resilience of federated learning against poisoning and Sybil attacks, whereas the resilience of decentralized learning remains largely unstudied. This research gap serves as the main motivator for this study, in which our objective is to improve the Sybil poisoning resilience of decentralized learning.
We present SybilWall, an innovative algorithm focused on increasing the resilience of decentralized learning against targeted Sybil poisoning attacks. By combining a Sybil-resistant aggregation function based on similarity between Sybils with a novel probabilistic gossiping mechanism, we establish a new benchmark for scalable, Sybil-resilient decentralized learning.
A comprehensive empirical evaluation demonstrated that SybilWall outperforms existing state-of-the-art solutions designed for federated learning scenarios and is the only algorithm to obtain consistent accuracy over a range of adversarial attack scenarios. We also found SybilWall to diminish the utility of creating many Sybils, as our evaluations demonstrate a higher success rate among adversaries employing fewer Sybils. Finally, we suggest a number of possible improvements to SybilWall and highlight promising future research directions.
Submitted 26 June, 2023;
originally announced June 2023.
-
Decentralized Learning Made Practical with Client Sampling
Authors:
Martijn de Vos,
Akash Dhasade,
Anne-Marie Kermarrec,
Erick Lavoie,
Johan Pouwelse,
Rishi Sharma
Abstract:
Decentralized learning (DL) leverages edge devices for collaborative model training while avoiding coordination by a central server. Due to privacy concerns, DL has become an attractive alternative to centralized learning schemes since training data never leaves the device. In a round of DL, all nodes participate in model training and exchange their model with some other nodes. Performing DL in large-scale heterogeneous networks results in high communication costs and prolonged round durations due to slow nodes, effectively inflating the total training time. Furthermore, current DL algorithms also assume all nodes are available for training and aggregation at all times, diminishing the practicality of DL. This paper presents Plexus, an efficient, scalable, and practical DL system. Plexus (1) avoids network-wide participation by introducing a decentralized peer sampler that selects small subsets of available nodes that train the model each round and, (2) aggregates the trained models produced by nodes every round. Plexus is designed to handle joining and leaving nodes (churn). We extensively evaluate Plexus by incorporating realistic traces for compute speed, pairwise latency, network capacity, and availability of edge devices in our experiments. Our experiments on four common learning tasks empirically show that Plexus reduces time-to-accuracy by 1.2-8.3x, communication volume by 2.4-15.3x and training resources needed for convergence by 6.4-370x compared to baseline DL algorithms.
Submitted 7 May, 2024; v1 submitted 27 February, 2023;
originally announced February 2023.
-
G-Rank: Unsupervised Continuous Learn-to-Rank for Edge Devices in a P2P Network
Authors:
Andrew Gold,
Johan Pouwelse
Abstract:
Ranking algorithms in traditional search engines are powered by enormous training data sets that are meticulously engineered and curated by a centralized entity. Decentralized peer-to-peer (p2p) networks such as torrenting applications and Web3 protocols deliberately eschew centralized databases and computational architectures when designing services and features. As such, robust search-and-rank algorithms designed for such domains must be engineered specifically for decentralized networks, and must be lightweight enough to operate on consumer-grade personal devices such as a smartphone or laptop computer. We introduce G-Rank, an unsupervised ranking algorithm designed exclusively for decentralized networks. We demonstrate that accurate, relevant ranking results can be achieved in fully decentralized networks without any centralized data aggregation, feature engineering, or model training. Furthermore, we show that such results are obtainable with minimal data preprocessing and computational overhead, and can still return highly relevant results even when a user's device is disconnected from the network. G-Rank is highly modular in design, is not limited to categorical data, and can be implemented in a variety of domains with minimal modification. The results herein show that unsupervised ranking models designed for decentralized p2p networks are not only viable, but worthy of further research.
Submitted 29 January, 2023;
originally announced January 2023.
-
The Universal Trust Machine: A survey on the Web3 path towards enabling long term digital cooperation through decentralised trust
Authors:
Rohan Madhwal,
Johan Pouwelse
Abstract:
Since the dawn of human civilization, trust has been the core challenge of social organization. Trust functions to reduce the effort spent in constantly monitoring others' actions in order to verify their assertions, thus facilitating cooperation by allowing groups to function with reduced complexity. To date, in modern societies, large scale trust is almost exclusively provided by large centralized institutions. Specifically in the case of the Internet, Big Tech companies maintain the largest Internet platforms where users can interact, transact and share information. Thus, they control who can interact and conduct transactions through their monopoly of online trust. However, as recent events have shown, allowing for-profit corporations to act as gatekeepers to the online world comes with a litany of problems. While so far ecosystems of trust on the Internet could only be feasibly created by large institutions, Web3 proponents have a vision of the Internet where trust is generated without centralised actors. They attempt to do so by creating an ecosystem of trust constructed using decentralised technology. This survey explores this elusive goal of Web3 to create a "Universal Trust Machine", which in a true decentralised paradigm would be owned by both nobody and everybody. In order to do so, we first motivate the decades-old problem of generating trust without an intermediary by discussing Robert Axelrod's research on the evolution of cooperation. Next, we present the challenges that would have to be overcome in order to enable long term cooperation. We proceed to present various reputation systems, all of which present promising techniques for encouraging trustworthy behaviour. Then, we discuss Distributed Ledger technologies whose secure transaction facilitating and privacy preserving techniques promise to be a good complement to the current limitations of vanilla reputation systems.
Submitted 17 January, 2023;
originally announced January 2023.
-
A Deployment-First Methodology to Mechanism Design and Refinement in Distributed Systems
Authors:
Martijn de Vos,
Georgy Ishmaev,
Johan Pouwelse,
Stefanie Roos
Abstract:
Catalyzed by the popularity of blockchain technology, there has recently been a renewed interest in the design, implementation and evaluation of decentralized systems. Most of these systems are intended to be deployed at scale and in heterogeneous environments with real users and unpredictable workloads. Nevertheless, most research in this field evaluates such systems in controlled environments that poorly reflect the complex conditions of real-world environments. In this work, we argue that deployment is crucial to understanding decentralized mechanisms in a real-world environment and an enabler to building more robust and sustainable systems. We highlight the merits of deployment by comparing this approach with other experimental setups and show how our lab applied a deployment-first methodology. We then outline how we use Tribler, our peer-to-peer file-sharing application, to deploy and monitor decentralized mechanisms at scale. We illustrate the application of our methodology by describing a deployment trial in experimental tokenomics. Finally, we summarize four lessons learned from multiple deployment trials where we applied our methodology.
Submitted 11 January, 2023;
originally announced January 2023.
-
Survey on social reputation mechanisms: Someone told me I can trust you
Authors:
Thomas Werthenbach,
Johan Pouwelse
Abstract:
Nowadays, most business and social interactions have moved to the internet, highlighting the relevance of creating online trust. One way to obtain a measure of trust is through reputation mechanisms, which record one's past performance and interactions to generate a reputational value. We observe that numerous existing reputation mechanisms share similarities with actual social phenomena; we call such mechanisms 'social reputation mechanisms'. The aim of this paper is to discuss several social phenomena and map these to existing social reputation mechanisms in a variety of scopes. First, we focus on reputation mechanisms in the individual scope, in which everyone is responsible for their own reputation. Subjective reputational values may be communicated to different entities in the form of recommendations. Secondly, we discuss social reputation mechanisms in the acquaintances scope, where one's reputation can be tied to another through vouching or invite-only networks. Finally, we present existing social reputation mechanisms in the neighbourhood scope. In such systems, one's reputation can heavily be affected by the behaviour of others in their neighbourhood or social group.
Submitted 13 December, 2022;
originally announced December 2022.
-
TrustVault: A privacy-first data wallet for the European Blockchain Services Infrastructure
Authors:
Sharif Jacobino,
Johan Pouwelse
Abstract:
The European Union is on course to introduce a European Digital Identity that will be available to all EU citizens and businesses. This will have a huge impact on how citizens and businesses interact online. Big Tech companies currently dictate how digital identities are used. As a result, they have amassed vast amounts of private user data. Movements like Self-Sovereign Identity aim to give users control over their online identity. TrustVault is the first data wallet that gives users back control of their identity and all their data. TrustVault allows users to store all their data on their smartphones and control with whom they share it. The user has fine-grained access control based on verifiable user attributes. EBSI connects TrustVault to the European Self-Sovereign Identity Framework allowing users to use Verifiable Credentials from public and private institutions in their access control policies. The system is serverless and has no Trusted Third Parties. TrustVault replaces the for-profit infrastructure of Big Tech with a public and transparent platform for innovation.
Submitted 6 October, 2022;
originally announced October 2022.
-
Gromit: Benchmarking the Performance and Scalability of Blockchain Systems
Authors:
Bulat Nasrulin,
Martijn De Vos,
Georgy Ishmaev,
Johan Pouwelse
Abstract:
The growing number of implementations of blockchain systems stands in stark contrast with still limited research on a systematic comparison of performance characteristics of these solutions. Such research is crucial for evaluating fundamental trade-offs introduced by novel consensus protocols and their implementations. These performance limitations are commonly analyzed with ad-hoc benchmarking frameworks focused on the consensus algorithm of blockchain systems. However, comparative evaluations of design choices require macro-benchmarks for uniform and comprehensive performance evaluations of blockchains at the system level rather than performance metrics of isolated components. To address this research gap, we implement Gromit, a generic framework for analyzing blockchain systems. Gromit treats each system under test as a transaction fabric where clients issue transactions to validators. We use Gromit to conduct the largest blockchain study to date, involving seven representative systems with varying consensus models. We determine the peak performance of these systems with a synthetic workload in terms of transaction throughput and scalability and show that transaction throughput does not scale with the number of validators. We explore how robust the subjected systems are against network delays and reveal that the performance of permissioned blockchains is highly sensitive to network conditions.
Submitted 23 August, 2022;
originally announced August 2022.
-
Distributed Attestation Revocation in Self-Sovereign Identity
Authors:
Rowdy Chotkan,
Jérémie Decouchant,
Johan Pouwelse
Abstract:
Self-Sovereign Identity (SSI) aspires to create a standardised identity layer for the Internet by placing citizens at the centre of their data, thereby weakening the grip of big tech on current digital identities. However, as millions of both physical and digital identities are lost annually, it is also necessary for SSIs to possibly be revoked to prevent misuse. Previous attempts at designing a revocation mechanism typically violate the principles of SSI by relying on central trusted components. This lack of a distributed revocation mechanism hampers the development of SSI. In this paper, we address this limitation and present the first fully distributed SSI revocation mechanism that does not rely on specialised trusted nodes. Our novel gossip-based propagation algorithm disseminates revocations throughout the network and provides nodes with a proof of revocation that enables offline verification of revocations. We demonstrate through simulations that our protocol adequately scales to national levels.
Submitted 12 August, 2022; v1 submitted 10 August, 2022;
originally announced August 2022.
-
MeritRank: Sybil Tolerant Reputation for Merit-based Tokenomics
Authors:
Bulat Nasrulin,
Georgy Ishmaev,
Johan Pouwelse
Abstract:
Decentralized reputation schemes present a promising area of experimentation in blockchain applications. These solutions aim to overcome the shortcomings of simple monetary incentive mechanisms of naive tokenomics. However, there is a significant research gap regarding the limitations and benefits of such solutions. We formulate these trade-offs as a conjecture on the irreconcilability of three desirable properties of the reputation system in this context. Such a system can not be simultaneously generalizable, trustless, and Sybil resistant. To handle the limitations of this trilemma, we propose MeritRank: Sybil tolerant feedback aggregation mechanism for reputation. Instead of preventing Sybil attacks, our approach successfully bounds the benefits of these attacks. Using a dataset of participants' interactions in MakerDAO, we run experiments to demonstrate Sybil tolerance of MeritRank. Decay parameters of reputation in MeritRank: transitivity decay and connectivity decay, allow for a fine-tuning of desirable levels of reputation utility and Sybil tolerance in different use contexts.
Submitted 20 July, 2022;
originally announced July 2022.
-
Double spending prevention of digital Euros using a web-of-trust
Authors:
Atanas Marinov,
Jurriaan Den Toonder,
Joep de Jong,
Pieter Tolsma,
Nils van den Honert,
Johan Pouwelse
Abstract:
To strengthen protection against double-spending, we have implemented a system that supports a web-of-trust. In this paper, we explore different approaches taken against double-spending and implement our own version to prevent it within TrustChain, as part of the ecosystem of EuroToken, the digital version of the euro. We have used the EVA protocol as a means to transfer data between users, building on the existing functionality of transferring money between users. This allows the sender of EuroTokens to leave recommendations of users based on their previous interactions with other users. This dissemination of trust through the network allows users to make more trustworthy decisions. Although this provides an upgrade in terms of usability, the mathematical details of our implementation can be explored further in other research.
Submitted 18 April, 2022; v1 submitted 14 April, 2022;
originally announced April 2022.
-
Web3: A Decentralized Societal Infrastructure for Identity, Trust, Money, and Data
Authors:
Joost Bambacht,
Johan Pouwelse
Abstract:
A movement for a more transparent and decentralized Internet is globally attracting more attention. People are becoming more privacy-aware of their online identities and data. The Internet is constantly evolving. Web2 focused on companies that provide services in exchange for personal user data. Web3 commits to user-centricity using decentralization and zero-server architectures. The current digital society demands a global change to empower citizens and take back control. Citizens are locked into big-tech for personal data storage and their for-profit digital identity. Protection of data has proven to be essential, especially due to increased home Internet traffic during the COVID pandemic. Citizens do not possess their own travel documents. The European Commission aims to transition this governmental property towards self-sovereign identity, introducing many new opportunities. Citizens are locked into banks with non-portable IBAN accounts and unsustainable legacy banking infrastructures. Migration to all-digital low-fraud infrastructures and healthier competitive ecosystems is essential. The overall challenge is to return the power to citizens and users again. The transition to a more decentralized Internet is the first crucial step in the realization of user-centricity. This thesis presents the first exploratory study that integrates governmental-issued travel documents into a (decentralized) societal infrastructure. These self-sovereign identities form the authentic base to a private and secure transfer of money and data, and can effectively provide trust in authenticity that is currently missing in online conversations. A fully operational zero-server infrastructure that incorporates all our requirements has been developed for Android using the P2P network overlay IPv8, and a personalized blockchain called TrustChain...
Submitted 3 March, 2022; v1 submitted 1 March, 2022;
originally announced March 2022.
-
Bristle: Decentralized Federated Learning in Byzantine, Non-i.i.d. Environments
Authors:
Joost Verbraeken,
Martijn de Vos,
Johan Pouwelse
Abstract:
Federated learning (FL) is a privacy-friendly type of machine learning where devices locally train a model on their private data and typically communicate model updates with a server. In decentralized FL (DFL), peers communicate model updates with each other instead. However, DFL is challenging since (1) the training data possessed by different peers is often non-i.i.d. (i.e., distributed differently between the peers) and (2) malicious, or Byzantine, attackers can share arbitrary model updates with other peers to subvert the training process.
We address these two challenges and present Bristle, middleware between the learning application and the decentralized network layer. Bristle leverages transfer learning to predetermine and freeze the non-output layers of a neural network, significantly speeding up model training and lowering communication costs. To securely update the output layer with model updates from other peers, we design a fast distance-based prioritizer and a novel performance-based integrator. Their combined effect results in high resilience to Byzantine attackers and the ability to handle non-i.i.d. classes.
We empirically show that Bristle converges to a consistent 95% accuracy in Byzantine environments, outperforming all evaluated baselines. In non-Byzantine environments, Bristle requires 83% fewer iterations to achieve 90% accuracy compared to state-of-the-art methods. We show that when the training classes are non-i.i.d., Bristle significantly outperforms the accuracy of the most Byzantine-resilient baselines by 2.3x while reducing communication costs by 90%.
Submitted 21 October, 2021;
originally announced October 2021.
-
ASTANA: Practical String Deobfuscation for Android Applications Using Program Slicing
Authors:
Martijn de Vos,
Johan Pouwelse
Abstract:
Software obfuscation is widely used by Android developers to protect the source code of their applications against adversarial reverse-engineering efforts. A specific type of obfuscation, string obfuscation, transforms the content of all string literals in the source code to non-interpretable text and inserts logic to deobfuscate these string literals at runtime. In this work, we demonstrate that string obfuscation is easily reversible. We present ASTANA, a practical tool for Android applications that recovers the human-readable content from obfuscated string literals. ASTANA makes minimal assumptions about the obfuscation logic or application structure. The key idea is to execute the deobfuscation logic for a specific (obfuscated) string literal, which yields the original string value. To obtain the relevant deobfuscation logic, we present a lightweight and optimistic algorithm, based on program slicing techniques. By an experimental evaluation with 100 popular real-world financial applications, we demonstrate the practicality of ASTANA. We verify the correctness of our deobfuscation tool and provide insights into the behaviour of string obfuscators applied by the developers of the evaluated Android applications.
Submitted 6 April, 2021;
originally announced April 2021.
-
A Truly Self-Sovereign Identity System
Authors:
Quinten Stokkink,
Georgy Ishmaev,
Dick Epema,
Johan Pouwelse
Abstract:
Existing digital identity management systems fail to deliver the desirable properties of control by the users of their own identity data, credibility of disclosed identity data, and network-level anonymity. The recently proposed Self-Sovereign Identity (SSI) approach promises to give users these properties. However, we argue that without addressing privacy at the network level, SSI systems cannot deliver on this promise. In this paper we present the design and analysis of our solution TCID, created in collaboration with the Dutch government. TCID is a system consisting of a set of components that together satisfy seven functional requirements to guarantee the desirable system properties. We show that the latency incurred by network-level anonymization in TCID is significantly larger than that of identity data disclosure protocols but is still low enough for practical situations. We conclude that current research on SSI is too narrowly focused on these data disclosure protocols.
Submitted 28 September, 2021; v1 submitted 1 July, 2020;
originally announced July 2020.
-
XChange: A Blockchain-based Mechanism for Generic Asset Trading In Resource-constrained Environments
Authors:
Martijn de Vos,
Can Umut Ileri,
Johan Pouwelse
Abstract:
An increasing number of industries rely on Internet-of-Things devices to track physical resources. Blockchain technology provides primitives to represent these resources as digital assets on a secure distributed ledger. Due to the proliferation of blockchain-based assets, there is an increasing need for a generic mechanism to trade assets between isolated platforms. To date, there is no such mechanism without reliance on a trusted third party.
In this work, we address this shortcoming and present XChange. Unlike existing approaches for decentralized asset trading, we decouple trade management and the actual exchange of assets. XChange mediates trade of any digital asset between isolated blockchain platforms while limiting the fraud conducted by adversarial parties. We first describe a generic, five-phase trading protocol that establishes and executes trade between individuals. This protocol accounts full trade specifications on a separate blockchain. We then devise a lightweight system architecture, composed of all required components for a generic asset marketplace.
We implement XChange and conduct real-world experimentation. We leverage an existing, lightweight blockchain, TrustChain, to account all orders and full trade specifications. By deploying XChange on multiple low-resource devices, we show that a full trade completes within half a second. To quantify the scalability of our mechanism, we conduct further experiments on our compute cluster. We conclude that the throughput of XChange, in terms of trades per second, scales linearly with the system load. Furthermore, we find that XChange exhibits superior throughput and order fulfilment latency compared to related decentralized exchanges, BitShares and Waves.
Submitted 10 April, 2020;
originally announced April 2020.
-
A Random Walk based Trust Ranking in Distributed Systems
Authors:
Alexander Stannat,
Johan Pouwelse
Abstract:
Honest cooperation among individuals in a network can be achieved in different ways. In online networks with some kind of central authority, such as eBay and Airbnb, honesty is achieved through a reputation system, which is maintained and secured by the central authority. These systems usually rely on review mechanisms, through which agents can evaluate the trustworthiness of their interaction partners. These reviews are stored centrally and are tamper-proof. In decentralized peer-to-peer networks, enforcing cooperation turns out to be more difficult. One way of approaching this problem is by observing cooperative biological communities in nature. One finds that cooperation among biological organisms is achieved through a mechanism called indirect reciprocity. This mechanism for cooperation relies on some shared notion of trust. In this work we aim to facilitate communal cooperation in a peer-to-peer file sharing network called Tribler, by introducing a mechanism for evaluating the trustworthiness of agents. We determine a trust ranking of all nodes in the network based on a Monte Carlo algorithm that estimates the values of Google's personalized PageRank vector. We go on to evaluate the algorithm's resistance to Sybil attacks, where our aim is for Sybils to be assigned low trust scores.
Submitted 14 March, 2019;
originally announced March 2019.
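As a rough illustration of the technique this abstract names, the following is a minimal sketch of estimating a personalized PageRank vector with Monte Carlo random walks. It is a generic textbook-style version, not the authors' implementation: the function name `monte_carlo_ppr`, the parameters `num_walks` and `reset_prob`, the value 0.15, and the toy graph are all assumptions made for this example.

```python
import random
from collections import Counter


def monte_carlo_ppr(graph, seed_node, num_walks=10_000, reset_prob=0.15, rng=None):
    """Estimate personalized PageRank scores relative to seed_node.

    graph maps each node to a list of nodes it points to (hypothetical format).
    Every walk starts at seed_node and, at each step, stops with probability
    reset_prob (or when it reaches a node without outgoing edges). The score of
    a node is its share of all visits made across all walks.
    """
    rng = rng or random.Random(0)  # fixed seed so the example is repeatable
    visits = Counter()
    for _ in range(num_walks):
        node = seed_node
        while True:
            visits[node] += 1
            neighbours = graph.get(node, [])
            if not neighbours or rng.random() < reset_prob:
                break
            node = rng.choice(neighbours)
    total = sum(visits.values())
    return {node: count / total for node, count in visits.items()}


# Toy graph (assumed): a, b, c are honest peers; s1 and s2 are Sybils that
# point at each other and at "a", but no honest peer points at them.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "s1": ["s2", "a"],
    "s2": ["s1", "a"],
}

scores = monte_carlo_ppr(graph, "a")
for node in sorted(graph):
    print(node, round(scores.get(node, 0.0), 3))
```

In this toy setting the Sybil nodes are never visited because no walk starting from the seed can reach them, so they receive a score of zero. Once an honest node links to a Sybil region, walks can enter it, which is the harder case that a Sybil-resistance evaluation has to examine.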
-
Deployment of a Blockchain-Based Self-Sovereign Identity
Authors:
Quinten Stokkink,
Johan Pouwelse
Abstract:
Digital identity is unsolved: after many years of research there is still no trusted communication over the Internet. To provide identity within the context of mutual distrust, this paper presents a blockchain-based digital identity solution. Without depending upon a single trusted third party, the proposed solution achieves passport-level legally valid identity. This solution for making identities Self-Sovereign builds on a generic provable claim model for which attestations of truth from third parties need to be collected. The claim model is then shown to be agnostic to both blockchain structure and proof method. Four different implementations in support of these two claim model properties are shown to offer sub-second performance for claim creation and claim verification. Through the properties of Self-Sovereign Identity, legally valid status and acceptable performance, our solution is considered to be fit for adoption by the general public.
Submitted 5 June, 2018;
originally announced June 2018.
-
Portable Trust: biometric-based authentication and blockchain storage for self-sovereign identity systems
Authors:
J. S. Hammudoglu,
J. Sparreboom,
J. I. Rauhamaa,
J. K. Faber,
L. C. Guerchi,
I. P. Samiotis,
S. P. Rao,
J. A. Pouwelse
Abstract:
We devised a mobile biometric-based authentication system relying only on local processing. Our Android open source solution explores the capability of current smartphones to acquire, process and match fingerprints using only their built-in hardware. Our architecture is specifically designed to run completely locally and autonomously, not requiring any cloud service, server, or permissioned access to fingerprint reader hardware. It involves three main stages, starting with the fingerprint acquisition using the smartphone camera, followed by a processing pipeline to obtain minutiae features and a final step for matching against other locally stored fingerprints, based on Oriented FAST and Rotated BRIEF (ORB) descriptors. We obtained a mean matching accuracy of 55%, with the highest value of 67% for thumb fingers. Our ability to capture and process a fingerprint in mere seconds using a smartphone makes this work usable in a wide range of scenarios, for instance, offline remote regions. This work is specifically designed to be a key building block for a self-sovereign identity solution and to integrate with our permissionless blockchain for identity and key attestation.
Submitted 12 June, 2017;
originally announced June 2017.
-
Implicit Consensus: Blockchain with Unbounded Throughput
Authors:
Zhijie Ren,
Kelong Cong,
Johan Pouwelse,
Zekeriya Erkin
Abstract:
Recently, the blockchain technique was put in the spotlight as it introduced a systematic approach for multiple parties to reach consensus without needing trust. However, the application of this technique in practice is severely restricted due to its limitations in throughput. In this paper, we propose a novel consensus model, namely the implicit consensus, with a distinctive blockchain-based distributed ledger in which each node holds its individual blockchain. In our system, the consensus is not on the transactions, but on a special type of blocks called Check Points that are used to validate individual transactions. Our system exploits the ideas of self-interest and spontaneous sharding and achieves unbounded throughput with transaction reliability equivalent to that of traditional Byzantine fault tolerance schemes.
Submitted 14 July, 2017; v1 submitted 31 May, 2017;
originally announced May 2017.
-
Survey of robust and resilient social media tools on Android
Authors:
P. W. G. Brussee,
J. A. Pouwelse
Abstract:
We present an overview of robust and resilient social media tools to overcome natural disasters, censorship and Internet kill switches. These social media tools use Android devices to communicate during disasters and aim to overcome attacks on freedom of expression. There is an abundance of projects that aim to provide resilient communication, enhance privacy, and provide anonymity. We focus specifically on the limited set of mature tools with a healthy development community and Internet-deployment.
Submitted 30 November, 2015;
originally announced December 2015.
-
Autonomous smartphone apps: self-compilation, mutation, and viral spreading
Authors:
Paul Brussee,
Johan Pouwelse
Abstract:
We present the first smartphone tool that is capable of self-compilation, mutation and viral spreading. Our autonomous app does not require a host computer to alter its functionality or change its appearance, and it does not need a central app store to spread among hosts. We pioneered survival skills for mobile software in order to overcome disrupted Internet access due to natural disasters and human-made interference, like Internet kill switches or censored networks. Internet kill switches have proven to be an effective tool to eradicate open Internet access and all forms of digital communication within an hour on a country-wide basis. We present the first operational tool that is capable of surviving such digital eradication.
Submitted 4 November, 2015; v1 submitted 2 November, 2015;
originally announced November 2015.
-
A survey of P2P multidimensional indexing structures
Authors:
Ewout Bongers,
Johan Pouwelse
Abstract:
Traditional databases have long reaped the benefits of multidimensional indexes. Numerous proposals in the literature describe multidimensional index designs for P2P systems. However, none of these designs has had a real-world implementation. Several proposals for P2P multidimensional indexes are reviewed and analyzed. Znet and VBI-tree are the most promising from a technical standpoint. All of the proposed designs assume honest nodes and are thus open to abuse. This is a critical flaw that must be solved before any of the proposed systems can be used.
Submitted 20 July, 2015;
originally announced July 2015.
-
Performance analysis of a Tor-like onion routing implementation
Authors:
Quinten Stokkink,
Harmjan Treep,
Johan Pouwelse
Abstract:
The current onion routing implementation of Tribler works as expected but throttles the overall throughput of the Tribler system. This article discusses a measuring procedure to reproducibly profile the tunnel implementation so further optimizations of the tunnel community can be made. Our work has been integrated into the Tribler eco-system.
Submitted 1 July, 2015;
originally announced July 2015.
-
Anonymous online purchases with exhaustive operational security
Authors:
Vincent Van Mieghem,
Johan Pouwelse
Abstract:
This paper describes the process of remaining anonymous online and the operational security that must be maintained alongside it. It focusses particularly on remaining anonymous while purchasing goods online, resulting in anonymously bought items. Different aspects of the operational security process, as well as anonymous funding with cryptocurrencies, are described. Finally, it is shown how to anonymously purchase items and services from the hidden web and how to have them delivered. It is shown that, while becoming increasingly difficult, it is still possible to make anonymous purchases. Our presented work combines existing best practices and deliberately avoids untested novel approaches when possible.
Submitted 27 May, 2015;
originally announced May 2015.
-
A Self-Compiling Android Data Obfuscation Tool
Authors:
Olivier Hokke,
Alex Kolpa,
Joris van den Oever,
Alex Walterbos,
Johan Pouwelse
Abstract:
Smartphones are becoming more significant in storing and transferring data. However, techniques ensuring this data is not compromised after a confiscation of the device are not readily available. DroidStealth is an open source Android application which combines data encryption and application obfuscation techniques to provide users with a way to securely hide content on their smartphones. This includes hiding the application's default launch methods and providing methods such as dial-to-launch or invisible launch buttons. A novel technique provided by DroidStealth is the ability to transform its appearance so that it can hide in plain sight on devices. To achieve this, it uses self-compilation, without requiring any special permissions. This two-layer protection aims to protect users and their data from casual search in various situations.
Submitted 9 February, 2015; v1 submitted 5 February, 2015;
originally announced February 2015.
-
Operational Distributed Regulation for Bitcoin
Authors:
Dinesh,
Erlich,
Gilfoyle,
Jared,
Richard,
Johan Pouwelse
Abstract:
In February 2014, $650,000,000 worth of Bitcoins disappeared. Currently it is unclear whether hackers or MtGox, the largest Bitcoin exchange, are to blame. In either case, the anonymous and unregulated nature of the Bitcoin system makes it practically impossible for innocent victims to get their money back. We have investigated the technical possibilities, solutions and implications of introducing a regulatory framework based on redlisting Bitcoin accounts. Despite numerous proposals, the Bitcoin community has voiced a strong opinion against any form of regulation. However, most of the discussions were based on speculation rather than facts. We strive to contribute a scientific foundation to these discussions and illuminate the path to crypto-justice.
Submitted 20 June, 2014;
originally announced June 2014.
-
The fifteen year struggle of decentralizing privacy-enhancing technology
Authors:
Rolf Jagerman,
Wendo Sabée,
Laurens Versluis,
Martijn de Vos,
Johan Pouwelse
Abstract:
Ever since its introduction, the internet has been devoid of privacy. The majority of internet traffic currently is and always has been unencrypted. A number of anonymous communication overlay networks exist that aim to provide privacy to their users. However, due to the nature of the internet, it is very difficult to make these networks both decentralized and anonymous. We list reasons for having anonymous networks, discern the problems in achieving decentralization and sum up the biggest initiatives in the field and their current status. To do so, we use one exemplary network, the Tor network. We explain how Tor works, what vulnerabilities this network currently has, and possible attacks that could be used to violate privacy and anonymity. The Tor network is used as a key comparison network in the main part of the report: a tabular overview of the major anonymous networking technologies in use today.
Submitted 18 April, 2014;
originally announced April 2014.