-
PUZZLES: A Benchmark for Neural Algorithmic Reasoning
Authors:
Benjamin Estermann,
Luca A. Lanzendörfer,
Yannick Niedermayr,
Roger Wattenhofer
Abstract:
Algorithmic reasoning is a fundamental cognitive ability that plays a pivotal role in problem-solving and decision-making processes. Reinforcement Learning (RL) has demonstrated remarkable proficiency in tasks such as motor control, handling perceptual input, and managing stochastic environments. These advancements have been enabled in part by the availability of benchmarks. In this work we introduce PUZZLES, a benchmark based on Simon Tatham's Portable Puzzle Collection, aimed at fostering progress in algorithmic and logical reasoning in RL. PUZZLES contains 40 diverse logic puzzles of adjustable sizes and varying levels of complexity; many puzzles also feature a diverse set of additional configuration parameters. The 40 puzzles provide detailed information on the strengths and generalization capabilities of RL agents. Furthermore, we evaluate various RL algorithms on PUZZLES, providing baseline comparisons and demonstrating the potential for future research. All the software, including the environment, is available at https://github.com/ETH-DISCO/rlp.
Submitted 29 June, 2024;
originally announced July 2024.
-
Towards Learning Abductive Reasoning using VSA Distributed Representations
Authors:
Giacomo Camposampiero,
Michael Hersche,
Aleksandar Terzić,
Roger Wattenhofer,
Abu Sebastian,
Abbas Rahimi
Abstract:
We introduce the Abductive Rule Learner with Context-awareness (ARLC), a model that solves abstract reasoning tasks based on Learn-VRF. ARLC features a novel and more broadly applicable training objective for abductive reasoning, resulting in better interpretability and higher accuracy when solving Raven's progressive matrices (RPM). ARLC allows both programming domain knowledge and learning the rules underlying a data distribution. We evaluate ARLC on the I-RAVEN dataset, showcasing state-of-the-art accuracy across both in-distribution and out-of-distribution (unseen attribute-rule pairs) tests. ARLC surpasses neuro-symbolic and connectionist baselines, including large language models, despite having orders of magnitude fewer parameters. We show ARLC's robustness to post-programming training by incrementally learning from examples on top of programmed knowledge, which only improves its performance and does not result in catastrophic forgetting of the programmed solution. We validate ARLC's seamless transfer learning from a 2x2 RPM constellation to unseen constellations. Our code is available at https://github.com/IBM/abductive-rule-learner-with-context-awareness.
Submitted 27 June, 2024;
originally announced June 2024.
-
Next Level Message-Passing with Hierarchical Support Graphs
Authors:
Carlos Vonessen,
Florian Grötschla,
Roger Wattenhofer
Abstract:
Message-Passing Neural Networks (MPNNs) are extensively employed in graph learning tasks but suffer from limitations such as the restricted scope of information exchange, by being confined to neighboring nodes during each round of message passing. Various strategies have been proposed to address these limitations, including incorporating virtual nodes to facilitate global information exchange. In this study, we introduce the Hierarchical Support Graph (HSG), an extension of the virtual node concept created through recursive coarsening of the original graph. This approach provides a flexible framework for enhancing information flow in graphs, independent of the specific MPNN layers utilized. We present a theoretical analysis of HSGs, investigate their empirical performance, and demonstrate that HSGs can surpass other methods augmented with virtual nodes, achieving state-of-the-art results across multiple datasets.
Submitted 22 June, 2024;
originally announced June 2024.
-
SoK: Attacks on DAOs
Authors:
Rainer Feichtinger,
Robin Fritsch,
Lioba Heimbach,
Yann Vonlanthen,
Roger Wattenhofer
Abstract:
Decentralized Autonomous Organizations (DAOs) are blockchain-based organizations that facilitate decentralized governance. Today, DAOs not only hold billions of dollars in their treasury but also govern many of the most popular Decentralized Finance (DeFi) protocols. This paper systematically analyses security threats to DAOs, focusing on the types of attacks they face. We study attacks on DAOs that took place in the past, attacks that have been theorized to be possible, and potential attacks that were uncovered and prevented in audits. For each of these (potential) attacks, we describe and categorize the attack vectors utilized into four categories. This reveals that while many attacks on DAOs take advantage of the less tangible and more complex human nature involved in governance, audits tend to focus on code and protocol vulnerabilities. Thus, additionally, the paper examines empirical data on DAO vulnerabilities, outlines risk factors contributing to these attacks, and suggests mitigation strategies to safeguard against such vulnerabilities.
Submitted 21 June, 2024;
originally announced June 2024.
-
An LLM-based Recommender System Environment
Authors:
Nathan Corecco,
Giorgio Piatti,
Luca A. Lanzendörfer,
Flint Xiaofeng Fan,
Roger Wattenhofer
Abstract:
Reinforcement learning (RL) has gained popularity in the realm of recommender systems due to its ability to optimize long-term rewards and guide users in discovering relevant content. However, the successful implementation of RL in recommender systems is challenging because of several factors, including the limited availability of online data for training on-policy methods. This scarcity requires expensive human interaction for online model training. Furthermore, the development of effective evaluation frameworks that accurately reflect the quality of models remains a fundamental challenge in recommender systems. To address these challenges, we propose a comprehensive framework for synthetic environments that simulate human behavior by harnessing the capabilities of large language models (LLMs). We complement our framework with in-depth ablation studies and demonstrate its effectiveness with experiments on movie and book recommendations. By utilizing LLMs as synthetic users, this work introduces a modular and novel framework for training RL-based recommender systems. The software, including the RL environment, is publicly available.
Submitted 1 June, 2024;
originally announced June 2024.
-
Unifying Partial Synchrony
Authors:
Andrei Constantinescu,
Diana Ghinea,
Jakub Sliwinski,
Roger Wattenhofer
Abstract:
The distributed computing literature considers multiple options for modeling communication. Most simply, communication is categorized as either synchronous or asynchronous. Synchronous communication assumes that messages get delivered within a publicly known timeframe and that parties' clocks are synchronized. Asynchronous communication, on the other hand, only assumes that messages get delivered eventually. A more nuanced approach, or a middle ground between the two extremes, is given by the partially synchronous model, which is arguably the most realistic option. This model comes in two commonly considered flavors:
(i) The Global Stabilization Time (GST) model: after an (unknown) amount of time, the network becomes synchronous. This captures scenarios where network issues are transient.
(ii) The Unknown Latency (UL) model: the network is, in fact, synchronous, but the message delay bound is unknown.
This work formally establishes that any time-agnostic property that can be achieved by a protocol in the UL model can also be achieved by a (possibly different) protocol in the GST model. By time-agnostic, we mean properties that can depend on the order in which events happen but not on time as measured by the parties. Most properties considered in distributed computing are time-agnostic. The converse was already known, even without the time-agnostic requirement, so our result shows that the two network conditions are, under one sensible assumption, equally demanding.
Submitted 16 May, 2024;
originally announced May 2024.
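The two flavors of partial synchrony described in the abstract above can be contrasted with a small sketch. It assumes one common formalization that the abstract does not spell out: in the GST model a message sent at time `t` arrives by `max(t, GST) + delta`, while in the UL model it arrives by `t + delta` for a bound `delta` the parties do not know. The function names and numeric values are hypothetical and chosen only for illustration.

```python
def gst_deadline(send_time: float, gst: float, delta: float) -> float:
    """Latest delivery time under the assumed GST formalization.

    The delay bound delta only takes effect once the (unknown) global
    stabilization time gst has passed.
    """
    return max(send_time, gst) + delta


def ul_deadline(send_time: float, delta: float) -> float:
    """Latest delivery time under the assumed UL formalization.

    The bound delta holds from the start, but the parties cannot use its
    value because it is unknown to them.
    """
    return send_time + delta


if __name__ == "__main__":
    # Hypothetical values: a message sent before and after stabilization.
    print(gst_deadline(send_time=2.0, gst=10.0, delta=1.5))
    print(gst_deadline(send_time=12.0, gst=10.0, delta=1.5))
    print(ul_deadline(send_time=2.0, delta=1.5))
```

In both functions the parties lack one parameter (`gst` in the first, `delta` in the second), which is the sense in which the abstract treats the two models as comparable.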
-
Assessing Adversarial Robustness of Large Language Models: An Empirical Study
Authors:
Zeyu Yang,
Zhao Meng,
Xiaochen Zheng,
Roger Wattenhofer
Abstract:
Large Language Models (LLMs) have revolutionized natural language processing, but their robustness against adversarial attacks remains a critical concern. We present a novel white-box style attack approach that exposes vulnerabilities in leading open-source LLMs, including Llama, OPT, and T5. We assess the impact of model size, structure, and fine-tuning strategies on their resistance to adversarial perturbations. Our comprehensive evaluation across five diverse text classification tasks establishes a new benchmark for LLM robustness. The findings of this study have far-reaching implications for the reliable deployment of LLMs in real-world applications and contribute to the advancement of trustworthy AI systems.
Submitted 4 May, 2024;
originally announced May 2024.
-
Hunting DeFi Vulnerabilities via Context-Sensitive Concolic Verification
Authors:
Yepeng Ding,
Arthur Gervais,
Roger Wattenhofer,
Hiroyuki Sato
Abstract:
Decentralized finance (DeFi) is revolutionizing the traditional centralized finance paradigm with its attractive features such as high availability, transparency, and tamper-proofing. However, attacks targeting DeFi services have severely damaged the DeFi market, as evidenced by our investigation of 80 real-world DeFi incidents from 2017 to 2022. Existing methods, based on symbolic execution, model checking, semantic analysis, and fuzzing, fall short in identifying most DeFi vulnerability types. To address this deficiency, we propose Context-Sensitive Concolic Verification (CSCV), a method for automating DeFi vulnerability finding based on user-defined properties formulated in temporal logic. CSCV builds and optimizes contexts to guide verification processes that dynamically construct context-carrying transition systems in tandem with concolic executions. Furthermore, we demonstrate the effectiveness of CSCV through experiments on real-world DeFi services and qualitative comparison. The experimental results show that our CSCV prototype successfully detects 76.25% of the vulnerabilities from the investigated incidents with an average time of 253.06 seconds.
Submitted 16 April, 2024;
originally announced April 2024.
-
CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening
Authors:
Hei Yi Mak,
Flint Xiaofeng Fan,
Luca A. Lanzendörfer,
Cheston Tan,
Wei Tsang Ooi,
Roger Wattenhofer
Abstract:
In this study, we delve into Federated Reinforcement Learning (FedRL) in the context of value-based agents operating across diverse Markov Decision Processes (MDPs). Existing FedRL methods typically aggregate agents' learning by averaging the value functions across them to improve their performance. However, this aggregation strategy is suboptimal in heterogeneous environments where agents converge to diverse optimal value functions. To address this problem, we introduce the Convergence-AwarE SAmpling with scReening (CAESAR) aggregation scheme designed to enhance the learning of individual agents across varied MDPs. CAESAR is an aggregation strategy used by the server that combines convergence-aware sampling with a screening mechanism. By exploiting the fact that agents learning in identical MDPs are converging to the same optimal value function, CAESAR enables the selective assimilation of knowledge from more proficient counterparts, thereby significantly enhancing the overall learning efficiency. We empirically validate our hypothesis and demonstrate the effectiveness of CAESAR in enhancing the learning efficiency of agents, using both a custom-built GridWorld environment and the classical FrozenLake-v1 task, each presenting varying levels of environmental heterogeneity.
Submitted 16 April, 2024; v1 submitted 29 March, 2024;
originally announced March 2024.
-
SUPClust: Active Learning at the Boundaries
Authors:
Yuta Ono,
Till Aczel,
Benjamin Estermann,
Roger Wattenhofer
Abstract:
Active learning is a machine learning paradigm designed to optimize model performance in a setting where labeled data is expensive to acquire. In this work, we propose a novel active learning method called SUPClust that seeks to identify points at the decision boundary between classes. By targeting these points, SUPClust aims to gather information that is most informative for refining the model's prediction of complex decision regions. We demonstrate experimentally that labeling these points leads to strong model performance. This improvement is observed even in scenarios characterized by strong class imbalance.
Submitted 6 March, 2024;
originally announced March 2024.
-
Bridging Diversity and Uncertainty in Active learning with Self-Supervised Pre-Training
Authors:
Paul Doucet,
Benjamin Estermann,
Till Aczel,
Roger Wattenhofer
Abstract:
This study addresses the integration of diversity-based and uncertainty-based sampling strategies in active learning, particularly within the context of self-supervised pre-trained models. We introduce a straightforward heuristic called TCM that mitigates the cold start problem while maintaining strong performance across various data levels. By initially applying TypiClust for diversity sampling and subsequently transitioning to uncertainty sampling with Margin, our approach effectively combines the strengths of both strategies. Our experiments demonstrate that TCM consistently outperforms existing methods across various datasets in both low and high data regimes.
Submitted 6 March, 2024;
originally announced March 2024.
-
CoRe-GD: A Hierarchical Framework for Scalable Graph Visualization with GNNs
Authors:
Florian Grötschla,
Joël Mathys,
Robert Veres,
Roger Wattenhofer
Abstract:
Graph Visualization, also known as Graph Drawing, aims to find geometric embeddings of graphs that optimize certain criteria. Stress is a widely used metric; stress is minimized when every pair of nodes is positioned at their shortest path distance. However, stress optimization presents computational challenges due to its inherent complexity and is usually solved using heuristics in practice. We introduce a scalable Graph Neural Network (GNN) based Graph Drawing framework with sub-quadratic runtime that can learn to optimize stress. Inspired by classical stress optimization techniques and force-directed layout algorithms, we create a coarsening hierarchy for the input graph. Beginning at the coarsest level, we iteratively refine and un-coarsen the layout, until we generate an embedding for the original graph. To enhance information propagation within the network, we propose a novel positional rewiring technique based on intermediate node positions. Our empirical evaluation demonstrates that the framework achieves state-of-the-art performance while remaining scalable.
Submitted 9 February, 2024;
originally announced February 2024.
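The stress metric mentioned in the CoRe-GD abstract above can be made concrete with a minimal sketch. It assumes the commonly used weighting `w_uv = d_uv^-2` and an unweighted, connected graph; neither detail is stated in the abstract, and the helper names are hypothetical. This only evaluates the objective by brute force over all node pairs; it is not the authors' GNN framework and does not have sub-quadratic runtime.

```python
from collections import deque
from itertools import combinations
from math import dist


def shortest_path_lengths(adjacency, source):
    """Return BFS hop distances from source in an unweighted graph."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def stress(adjacency, positions):
    """Sum over node pairs of (||x_u - x_v|| - d_uv)^2 / d_uv^2.

    d_uv is the shortest path distance; the 1 / d_uv^2 weight is an
    assumed (but common) choice. Requires a connected graph.
    """
    all_distances = {u: shortest_path_lengths(adjacency, u) for u in adjacency}
    total = 0.0
    for u, v in combinations(sorted(adjacency), 2):
        d_uv = all_distances[u][v]
        total += (dist(positions[u], positions[v]) - d_uv) ** 2 / d_uv ** 2
    return total


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with two hypothetical layouts.
    path = {0: [1], 1: [0, 2], 2: [1]}
    straight = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}
    folded = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 0.0)}
    print(stress(path, straight))  # every pair sits at its path distance
    print(stress(path, folded))    # nodes 0 and 2 collapse onto each other
```

The straight layout places every pair at exactly its shortest path distance, so its stress is zero, matching the abstract's description of when stress is minimized.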
-
Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence
Authors:
Philip Jordan,
Florian Grötschla,
Flint Xiaofeng Fan,
Roger Wattenhofer
Abstract:
In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide the first decentralized Byzantine fault-tolerant FRL method. Towards this end, we first propose a new centralized Byzantine fault-tolerant policy gradient (PG) algorithm that improves over existing methods by relying only on assumptions standard for non-fault-tolerant PG. Then, as our main contribution, we show how a combination of robust aggregation and Byzantine-resilient agreement methods can be leveraged in order to eliminate the need for a trusted central entity. Since our results represent the first sample complexity analysis for Byzantine fault-tolerant decentralized federated non-convex optimization, our technical contributions may be of independent interest. Finally, we corroborate our theoretical results experimentally for common RL environments, demonstrating the speed-up of decentralized federations w.r.t. the number of participating agents and resilience against various Byzantine attacks.
Submitted 7 January, 2024;
originally announced January 2024.
-
Efficient and Scalable Graph Generation through Iterative Local Expansion
Authors:
Andreas Bergmeister,
Karolis Martinkus,
Nathanaël Perraudin,
Roger Wattenhofer
Abstract:
In the realm of generative models for graphs, extensive research has been conducted. However, most existing methods struggle with large graphs due to the complexity of representing the entire joint distribution across all node pairs and capturing both global and local graph structures simultaneously. To overcome these issues, we introduce a method that generates a graph by progressively expanding a single node to a target graph. In each step, nodes and edges are added in a localized manner through denoising diffusion, building first the global structure, and then refining the local details. The local generation avoids modeling the entire joint distribution over all node pairs, achieving substantial computational savings with subquadratic runtime relative to node count while maintaining high expressivity through multiscale generation. Our experiments show that our model achieves state-of-the-art performance on well-established benchmark datasets while successfully scaling to graphs with at least 5000 nodes. Our method is also the first to successfully extrapolate to graphs outside of the training distribution, showcasing a much better generalization capability over existing methods.
Submitted 14 May, 2024; v1 submitted 14 December, 2023;
originally announced December 2023.
-
Dissecting the EIP-2930 Optional Access Lists
Authors:
Lioba Heimbach,
Quentin Kniep,
Yann Vonlanthen,
Roger Wattenhofer,
Patrick Züst
Abstract:
Ethereum introduced Transaction Access Lists (TALs) in 2020 to optimize gas costs during transaction execution. In this work, we present a comprehensive analysis of TALs in Ethereum, focusing on adoption, quality, and gas savings. Analyzing a full month of mainnet data with 31,954,474 transactions, we found that only 1.46% of transactions included a TAL, even though 42.6% of transactions would have benefited from it. On average, access lists can save around 0.29% of gas costs, equivalent to approximately 3,450 ETH (roughly US$5 million) per year. However, 19.6% of TALs included by transactions contained imperfections, causing almost 11.8% of transactions to pay more gas with a TAL than without. We find that these inaccuracies are caused by the unknown state at the time of the TAL computation as well as imperfect TAL computations provided by all major Ethereum clients. We thus compare the gas savings when calculating the TAL at the beginning of the block vs. calculating it on the correct state, to find that the unknown state is a major source of TAL inaccuracies. Finally, we implement an ideal TAL computation for the Erigon client to highlight the cost of these flawed implementations.
Submitted 11 December, 2023;
originally announced December 2023.
-
Fast Internet Computer Consensus
Authors:
Massimo Albarello,
Jakub Sliwinski,
Yann Vonlanthen,
Roger Wattenhofer
Abstract:
This paper presents the first rotating leader state machine replication (SMR) protocol that allows transactions to be confirmed in just a single round-trip time in the Byzantine fault tolerance (BFT) setting. Based on minimal alterations to the Internet Computer Consensus (ICC) protocol and with negligible communication overhead, we introduce a novel dual mode mechanism that enables optimal block finalization latency in the fast path. Crucially, the modes of operation are integrated, such that even if the fast path is not effective, no penalties are incurred. Moreover, our algorithm maintains the core attributes of the original ICC protocol, including optimistic responsiveness and rotating leaders without the necessity for a view-change protocol.
We prove the correctness of our Fast Internet Computer Consensus (FICC) protocol and provide an open-source implementation of it. Both the FICC and original ICC protocol are compared in a globally distributed wide-area network. Our evaluation reveals that the FICC protocol achieves reduced latency compared to the ICC protocol, without requiring additional security assumptions. Furthermore, by increasing the number of replicas to $n = 5f + 1$, we exhibit that latency improvements close to the theoretical maximum of 33% are attainable. We conclude by highlighting the network topology as a significant factor in evaluating and comparing the latency of consensus algorithms.
Submitted 10 December, 2023;
originally announced December 2023.
-
Exponentially Faster Language Modelling
Authors:
Peter Belcak,
Roger Wattenhofer
Abstract:
Language models only really need to use an exponential fraction of their neurons for individual inferences. As proof, we present UltraFastBERT, a BERT variant that uses 0.3% of its neurons during inference while performing on par with similar BERT models. UltraFastBERT selectively engages just 12 out of 4095 neurons for each layer inference. This is achieved by replacing feedforward networks with fast feedforward networks (FFFs). While no truly efficient implementation currently exists to unlock the full acceleration potential of conditional neural execution, we provide high-level CPU code achieving 78x speedup over the optimized baseline feedforward implementation, and a PyTorch implementation delivering 40x speedup over the equivalent batched feedforward inference. We publish our training code, benchmarking setup, and model weights.
Submitted 21 November, 2023; v1 submitted 15 November, 2023;
originally announced November 2023.
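The neuron counts quoted in the UltraFastBERT abstract above can be checked with a few lines of arithmetic. The sketch assumes one reading that is consistent with those numbers but not stated in the abstract: the neurons of a layer are arranged as a complete binary tree with 12 levels, and each inference follows a single root-to-leaf path. The function name is hypothetical.

```python
def complete_binary_tree_nodes(levels: int) -> int:
    """Number of nodes in a complete binary tree with the given number of levels."""
    return 2 ** levels - 1


if __name__ == "__main__":
    levels = 12  # assumed tree height, chosen to match the quoted figures
    total_neurons = complete_binary_tree_nodes(levels)
    used_neurons = levels  # one neuron per level along a root-to-leaf path
    fraction = used_neurons / total_neurons
    print(total_neurons)
    print(f"{fraction:.2%}")
```

Under this assumption the tree has 4095 nodes, and a path through it touches 12 of them, which rounds to the 0.3% figure given in the abstract.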
-
SURF: A Generalization Benchmark for GNNs Predicting Fluid Dynamics
Authors:
Stefan Künzli,
Florian Grötschla,
Joël Mathys,
Roger Wattenhofer
Abstract:
Simulating fluid dynamics is crucial for the design and development process, ranging from simple valves to complex turbomachinery. Accurately solving the underlying physical equations is computationally expensive. Therefore, learning-based solvers that model interactions on meshes have gained interest due to their promising speed-ups. However, it is unknown to what extent these models truly understand the underlying physical principles and can generalize rather than interpolate. Generalization is a key requirement for a general-purpose fluid simulator, which should adapt to different topologies, resolutions, or thermodynamic ranges. We propose SURF, a benchmark designed to test the $\textit{generalization}$ of learned graph-based fluid simulators. SURF comprises individual datasets and provides specific performance and generalization metrics for evaluating and comparing different models. We empirically demonstrate the applicability of SURF by thoroughly investigating the two state-of-the-art graph-based models, yielding new insights into their generalization.
Submitted 20 November, 2023; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Flood and Echo Net: Algorithmically Aligned GNNs that Generalize
Authors:
Joël Mathys,
Florian Grötschla,
Kalyan Varma Nadimpalli,
Roger Wattenhofer
Abstract:
Most Graph Neural Networks follow the standard message-passing framework where, in each step, all nodes simultaneously communicate with each other. We want to challenge this paradigm by aligning the computation more closely to the execution of distributed algorithms and propose the Flood and Echo Net. A single round of a Flood and Echo Net consists of an origin node and a flooding phase followed by an echo phase. First, during the flooding, messages are sent from the origin and propagated outwards throughout the entire graph. Then, during the echo, the message flow reverses and messages are sent back towards the origin. As nodes are only sparsely activated upon receiving a message, this leads to a wave-like activation pattern that traverses the graph. Through these sparse but parallel activations, the Net becomes more expressive than traditional MPNNs which are limited by the 1-WL test and also is provably more efficient in terms of message complexity. Moreover, the mechanism's inherent ability to generalize across graphs of varying sizes positions it as a practical architecture for the task of algorithmic learning. We test the Flood and Echo Net on a variety of synthetic tasks and the SALSA-CLRS benchmark and find that the algorithmic alignment of the execution improves generalization to larger graph sizes.
Submitted 3 June, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Recovering Single-Crossing Preferences From Approval Ballots
Authors:
Andrei Constantinescu,
Roger Wattenhofer
Abstract:
An electorate with fully-ranked innate preferences casts approval votes over a finite set of alternatives. As a result, only partial information about the true preferences is revealed to the voting authorities. In an effort to understand the nature of the true preferences given only partial information, one might ask whether the unknown innate preferences could possibly be single-crossing. The existence of a polynomial time algorithm to determine this has been asked as an outstanding problem in the works of Elkind and Lackner. We hereby give a polynomial time algorithm determining a single-crossing collection of fully-ranked preferences that could have induced the elicited approval ballots, or reporting the nonexistence thereof. Moreover, we consider the problem of identifying negative instances with a set of forbidden sub-ballots, showing that any such characterization requires infinitely many forbidden configurations.
Submitted 6 October, 2023; v1 submitted 5 October, 2023;
originally announced October 2023.
-
What Determines the Price of NFTs?
Authors:
Vivian Ziemke,
Benjamin Estermann,
Roger Wattenhofer,
Ye Wang
Abstract:
In the evolving landscape of digital art, Non-Fungible Tokens (NFTs) have emerged as a groundbreaking platform, bridging the realms of art and technology. NFTs serve as the foundational framework that has revolutionized the market for digital art, enabling artists to showcase and monetize their creations in unprecedented ways. NFTs combine metadata stored on the blockchain with off-chain data, such as images, to create a novel form of digital ownership. It is not fully understood how these factors come together to determine NFT prices. In this study, we analyze both on-chain and off-chain data of NFT collections trading on OpenSea to understand what influences NFT pricing. Our results show that while text and image data of the NFTs can be used to explain price variations within collections, the extracted features do not generalize to new, unseen collections. Furthermore, we find that an NFT collection's trading volume often relates to its online presence, like social media followers and website traffic.
Submitted 3 October, 2023;
originally announced October 2023.
-
The PoW Landscape in the Aftermath of The Merge
Authors:
Lucianna Kiffer,
Sophia Skorik,
Yann Vonlanthen,
Roger Wattenhofer
Abstract:
On 15th September 2022, The Merge marked the Ethereum network's transition from computation-hardness-based consensus (proof-of-work) to a committee-based consensus mechanism (proof-of-stake). As a result, all the specialized hardware and GPUs that were being used by miners ceased to be profitable in the main Ethereum network. Miners were then left with the decision of how to re-purpose their hardware. One such choice was to try and make a profit mining another existing PoW system. In this study, we explore this choice by analyzing the hashrate increase in the top PoW networks following The Merge. Our findings reveal that the peak increase in hashrate to other PoW networks following The Merge represents an adoption of at least 41% of the hashrate that was present in Ethereum, with 12% remaining more than 5 months later. Though we measure a drastic decrease in profitability by almost an order of magnitude, the continued presence of miners halts claims that power consumption was instantly addressed by Ethereum's switch to PoS.
Submitted 2 October, 2023;
originally announced October 2023.
-
SALSA-CLRS: A Sparse and Scalable Benchmark for Algorithmic Reasoning
Authors:
Julian Minder,
Florian Grötschla,
Joël Mathys,
Roger Wattenhofer
Abstract:
We introduce an extension to the CLRS algorithmic learning benchmark, prioritizing scalability and the utilization of sparse representations. Many algorithms in CLRS require global memory or information exchange, mirrored in its execution model, which constructs fully connected (not sparse) graphs based on the underlying problem. Despite CLRS's aim of assessing how effectively learned algorithms can generalize to larger instances, the existing execution model becomes a significant constraint due to its demanding memory requirements and runtime (hard to scale). However, many important algorithms do not demand a fully connected graph; these algorithms, primarily distributed in nature, align closely with the message-passing paradigm employed by Graph Neural Networks. Hence, we propose SALSA-CLRS, an extension of the current CLRS benchmark specifically with scalability and sparseness in mind. Our approach includes adapted algorithms from the original CLRS benchmark and introduces new problems from distributed and randomized algorithms. Moreover, we perform a thorough empirical evaluation of our benchmark. Code is publicly available at https://github.com/jkminder/SALSA-CLRS.
Submitted 20 November, 2023; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Fast Feedforward Networks
Authors:
Peter Belcak,
Roger Wattenhofer
Abstract:
We break the linear link between the layer size and its inference cost by introducing the fast feedforward (FFF) architecture, a log-time alternative to feedforward networks. We demonstrate that FFFs are up to 220x faster than feedforward networks, up to 6x faster than mixture-of-experts networks, and exhibit better training properties than mixtures of experts thanks to noiseless conditional execution. Pushing FFFs to the limit, we show that they can use as little as 1% of layer neurons for inference in vision transformers while preserving 94.2% of predictive performance.
Submitted 18 September, 2023; v1 submitted 28 August, 2023;
originally announced August 2023.
-
An Interpretable and Attention-based Method for Gaze Estimation Using Electroencephalography
Authors:
Nina Weng,
Martyna Plomecka,
Manuel Kaufmann,
Ard Kastrati,
Roger Wattenhofer,
Nicolas Langer
Abstract:
Eye movements can reveal valuable insights into various aspects of human mental processes, physical well-being, and actions. Recently, several datasets have been made available that simultaneously record EEG activity and eye movements. This has triggered the development of various methods to predict gaze direction based on brain activity. However, most of these methods lack interpretability, which limits their technology acceptance. In this paper, we leverage a large data set of simultaneously measured Electroencephalography (EEG) and Eye tracking, proposing an interpretable model for gaze estimation from EEG data. More specifically, we present a novel attention-based deep learning framework for EEG signal analysis, which allows the network to focus on the most relevant information in the signal and discard problematic channels. Additionally, we provide a comprehensive evaluation of the presented framework, demonstrating its superiority over current methods in terms of accuracy and robustness. Finally, the study presents visualizations that explain the results of the analysis and highlights the potential of attention mechanisms for improving the efficiency and effectiveness of EEG data analysis in a variety of applications.
Submitted 9 August, 2023;
originally announced August 2023.
-
Graphtester: Exploring Theoretical Boundaries of GNNs on Graph Datasets
Authors:
Eren Akbiyik,
Florian Grötschla,
Beni Egressy,
Roger Wattenhofer
Abstract:
Graph Neural Networks (GNNs) have emerged as a powerful tool for learning from graph-structured data. However, even state-of-the-art architectures have limitations on what structures they can distinguish, imposing theoretical limits on what the networks can achieve on different datasets. In this paper, we provide a new tool called Graphtester for a comprehensive analysis of the theoretical capabilities of GNNs for various datasets, tasks, and scores. We use Graphtester to analyze over 40 different graph datasets, determining upper bounds on the performance of various GNNs based on the number of layers. Further, we show that the tool can also be used for Graph Transformers using positional node encodings, thereby expanding its scope. Finally, we demonstrate that features generated by Graphtester can be used for practical applications such as Graph Transformers, and provide a synthetic dataset to benchmark node and edge features, such as positional encodings. The package is freely available at the following URL: https://github.com/meakbiyik/graphtester.
Submitted 30 June, 2023;
originally announced June 2023.
-
DISCO-10M: A Large-Scale Music Dataset
Authors:
Luca A. Lanzendörfer,
Florian Grötschla,
Emil Funke,
Roger Wattenhofer
Abstract:
Music datasets play a crucial role in advancing research in machine learning for music. However, existing music datasets suffer from limited size, accessibility, and lack of audio resources. To address these shortcomings, we present DISCO-10M, a novel and extensive music dataset that surpasses the largest previously available music dataset by an order of magnitude. To ensure high-quality data, we implement a multi-stage filtering process. This process incorporates similarities based on textual descriptions and audio embeddings. Moreover, we provide precomputed CLAP embeddings alongside DISCO-10M, facilitating direct application on various downstream tasks. These embeddings enable efficient exploration of machine learning applications on the provided data. With DISCO-10M, we aim to democratize and facilitate new research to help advance the development of novel machine learning models for music.
Submitted 5 October, 2023; v1 submitted 23 June, 2023;
originally announced June 2023.
-
Siamese SIREN: Audio Compression with Implicit Neural Representations
Authors:
Luca A. Lanzendörfer,
Roger Wattenhofer
Abstract:
Implicit Neural Representations (INRs) have emerged as a promising method for representing diverse data modalities, including 3D shapes, images, and audio. While recent research has demonstrated successful applications of INRs in image and 3D shape compression, their potential for audio compression remains largely unexplored. Motivated by this, we present a preliminary investigation into the use of INRs for audio compression. Our study introduces Siamese SIREN, a novel approach based on the popular SIREN architecture. Our experimental results indicate that Siamese SIREN achieves superior audio reconstruction fidelity while utilizing fewer network parameters compared to previous INR architectures.
Submitted 22 June, 2023;
originally announced June 2023.
-
Provably Powerful Graph Neural Networks for Directed Multigraphs
Authors:
Béni Egressy,
Luc von Niederhäusern,
Jovan Blanusa,
Erik Altman,
Roger Wattenhofer,
Kubilay Atasu
Abstract:
This paper analyses a set of simple adaptations that transform standard message-passing Graph Neural Networks (GNN) into provably powerful directed multigraph neural networks. The adaptations include multigraph port numbering, ego IDs, and reverse message passing. We prove that the combination of these theoretically enables the detection of any directed subgraph pattern. To validate the effectiveness of our proposed adaptations in practice, we conduct experiments on synthetic subgraph detection tasks, which demonstrate outstanding performance with almost perfect results. Moreover, we apply our proposed adaptations to two financial crime analysis tasks. We observe dramatic improvements in detecting money laundering transactions, improving the minority-class F1 score of a standard message-passing GNN by up to 30%, and closely matching or outperforming tree-based and GNN baselines. Similarly impressive results are observed on a real-world phishing detection dataset, boosting three standard GNNs' F1 scores by around 15% and outperforming all baselines.
Submitted 4 January, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Ethereum Proof-of-Stake Consensus Layer: Participation and Decentralization
Authors:
Dominic Grandjean,
Lioba Heimbach,
Roger Wattenhofer
Abstract:
In September 2022, Ethereum transitioned from Proof-of-Work (PoW) to Proof-of-Stake (PoS) during "the merge" - making it the largest PoS cryptocurrency in terms of market capitalization. With this work, we present a comprehensive measurement study of the current state of the Ethereum PoS consensus layer on the beacon chain. We perform a longitudinal study of the history of the beacon chain. Our work finds that all dips in network participation are caused by network upgrades, issues with major consensus clients, or issues with service operators controlling a large number of validators. Further, our longitudinal staking power decentralization analysis reveals that Ethereum PoS fares similarly to its PoW counterpart in terms of decentralization and exhibits the immense impact of (liquid) staking services on staking power decentralization. Finally, we highlight the heightened security concerns in Ethereum PoS caused by high degrees of centralization.
Submitted 22 September, 2023; v1 submitted 19 June, 2023;
originally announced June 2023.
-
The Potential of Self-Regulation for Front-Running Prevention on DEXes
Authors:
Lioba Heimbach,
Eric Schertenleib,
Roger Wattenhofer
Abstract:
The transaction ordering dependency of the smart contracts building decentralized exchanges (DEXes) allows for predatory trading strategies. In particular, front-running attacks present a constant risk for traders on DEXes. Whereas legal regulation outlaws most front-running practices in traditional finance, such measures are ineffective in preventing front-running on DEXes due to the absence of a central authority. While novel market designs hindering front-running may emerge, it remains unclear whether the market's participants, in particular liquidity providers, would be willing to adopt these new designs. A misalignment of the participants' private incentives and the market's social incentives can hinder the market from adopting an effective prevention mechanism.
We present a game-theoretic model to study the behavior of traders and liquidity providers in DEXes. Our work finds that in most market configurations, the private interests of traders and liquidity providers align with the market's social incentives - eliminating front-running attacks. However, even though liquidity providers generally benefit from embracing the market that prevents front-running, the benefit is often small and may not suffice to entice them to change strategy in reality. Thus, we find that inert liquidity providers might require additional incentives to adopt innovative market designs and permit the market's successful self-regulation.
Submitted 9 June, 2023;
originally announced June 2023.
-
Examining the Emergence of Deductive Reasoning in Generative Language Models
Authors:
Peter Belcak,
Luca A. Lanzendörfer,
Roger Wattenhofer
Abstract:
We conduct a preliminary inquiry into the ability of generative transformer models to deductively reason from premises provided. We observe notable differences in the performance of models coming from different training setups and find that the deductive reasoning ability increases with scale. Further, we discover that the performance generally does not decrease with the length of the deductive chain needed to reach the conclusion, with the exception of OpenAI GPT-3 and GPT-3.5 models. Our study considers a wide variety of transformer-decoder models, ranging from 117 million to 175 billion parameters in size.
Submitted 31 May, 2023;
originally announced June 2023.
-
Ethereum's Proposer-Builder Separation: Promises and Realities
Authors:
Lioba Heimbach,
Lucianna Kiffer,
Christof Ferreira Torres,
Roger Wattenhofer
Abstract:
With Ethereum's transition from Proof-of-Work to Proof-of-Stake in September 2022 came another paradigm shift, the Proposer-Builder Separation (PBS) scheme. PBS was introduced to decouple the roles of selecting and ordering transactions in a block (i.e., the builder), from those validating its contents and proposing the block to the network as the new head of the blockchain (i.e., the proposer). In this landscape, proposers are the validators in the Proof-of-Stake consensus protocol, while now relying on specialized block builders for creating blocks with the highest value for the proposer. Additionally, relays act as mediators between builders and proposers. We study PBS adoption and show that the current landscape exhibits significant centralization amongst the builders and relays. Further, we explore whether PBS effectively achieves its intended objectives of enabling hobbyist validators to maximize block profitability and preventing censorship. Our findings reveal that although PBS grants validators the opportunity to access optimized and competitive blocks, it tends to stimulate censorship rather than reduce it. Additionally, we demonstrate that relays do not consistently uphold their commitments and may prove unreliable. Specifically, proposers do not always receive the complete promised value, and the censorship or filtering capabilities pledged by relays exhibit significant gaps.
Submitted 24 September, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Cascaded Beam Search: Plug-and-Play Terminology-Forcing For Neural Machine Translation
Authors:
Frédéric Odermatt,
Béni Egressy,
Roger Wattenhofer
Abstract:
This paper presents a plug-and-play approach for translation with terminology constraints. Terminology constraints are an important aspect of many modern translation pipelines. In both specialized domains and newly emerging domains (such as the COVID-19 pandemic), accurate translation of technical terms is crucial. Recent approaches often train models to copy terminologies from the input into the output sentence by feeding the target terminology along with the input. But this requires expensive training whenever the underlying language model is changed or the system should specialize to a new domain. We propose Cascade Beam Search, a plug-and-play terminology-forcing approach that requires no training. Cascade Beam Search has two parts: 1) logit manipulation to increase the probability of target terminologies and 2) a cascading beam setup based on grid beam search, where beams are grouped by the number of terminologies they contain. We evaluate the performance of our approach by competing against the top submissions of the WMT21 terminology translation task. Our plug-and-play approach performs on par with the winning submissions without using a domain-specific language model and with no additional training.
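The two parts named above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the helper names `boost_logits` and `group_by_term_count`, the additive `bias` value, and the simplification that each terminology is a single token are all hypothetical.

```python
import math

def boost_logits(logits, target_ids, bias=2.0):
    """Part 1 (assumed form): add a constant bias to the logits of target tokens."""
    return [x + bias if i in target_ids else x for i, x in enumerate(logits)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def group_by_term_count(beams, terms):
    """Part 2 (assumed form): bucket beams by how many target terms they contain."""
    groups = {}
    for beam in beams:
        count = sum(1 for t in terms if t in beam)
        groups.setdefault(count, []).append(beam)
    return groups

# Token 1 is a target term; boosting raises its probability above uniform.
probs = softmax(boost_logits([0.0, 0.0, 0.0], target_ids={1}))

# Beams are token lists; terms are single tokens in this simplified sketch.
groups = group_by_term_count([["a", "b"], ["c"], ["a", "x", "b"]], ["a", "b"])
```

Because both steps act only on decoder outputs, a sketch like this needs no change to model weights, which is the sense in which the abstract calls the approach plug-and-play.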
Submitted 23 May, 2023;
originally announced May 2023.
-
Stable Dinner Party Seating Arrangements
Authors:
Damien Berriaud,
Andrei Constantinescu,
Roger Wattenhofer
Abstract:
A group of $n$ agents with numerical preferences for each other are to be assigned to the $n$ seats of a dining table. We study two natural topologies: circular (cycle) tables and panel (path) tables. For a given seating arrangement, an agent's utility is the sum of their preference values towards their (at most two) direct neighbors. An arrangement is envy-free if no agent strictly prefers someone else's seat, and it is stable if no two agents strictly prefer each other's seats. Recently, it was shown that for both paths and cycles it is NP-hard to decide whether an envy-free arrangement exists, even for symmetric binary preferences. In contrast, we show that, if agents come from a bounded number of classes, the problem is solvable in polynomial time for arbitrarily-valued possibly asymmetric preferences, including outputting an arrangement if possible. We also give simpler proofs of the previous hardness results if preferences are allowed to be asymmetric. For stability, it is known that deciding the existence of stable arrangements is NP-hard for both topologies, but only if sufficiently-many numerical values are allowed. As it turns out, even constructing unstable instances can be challenging in certain cases, e.g., binary values. We completely characterize the existence of stable arrangements based on the number of distinct values in the preference matrix and the number of agent classes. We also ask the same question for non-negative values and give an almost-complete characterization, the most interesting outstanding case being that of paths with two-valued non-negative preferences, for which we experimentally find that stable arrangements always exist and prove it under the additional constraint that agents can only swap seats when sitting at most two positions away. We moreover give a polynomial algorithm for determining a stable arrangement assuming a bounded number of classes.
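The definitions of utility, envy-freeness, and stability can be made concrete with a brute-force checker. This is a minimal sketch, assuming that "prefers someone else's seat" means the agent's utility strictly increases if the two agents swap seats; the function names and the list-of-lists preference matrix `pref` are illustrative choices, not taken from the paper.

```python
from itertools import combinations

def neighbors(seat, n, cycle):
    """Seats adjacent to `seat` on a cycle or a path with n seats."""
    if cycle:
        return {(seat - 1) % n, (seat + 1) % n} - {seat}
    return {s for s in (seat - 1, seat + 1) if 0 <= s < n}

def utility(agent, arrangement, pref, cycle):
    """Sum of the agent's preference values towards direct neighbors."""
    seat = arrangement.index(agent)
    return sum(pref[agent][arrangement[s]]
               for s in neighbors(seat, len(arrangement), cycle))

def swapped(arrangement, a, b):
    """Copy of the arrangement with agents a and b exchanging seats."""
    out = list(arrangement)
    i, j = out.index(a), out.index(b)
    out[i], out[j] = out[j], out[i]
    return out

def envies(a, b, arrangement, pref, cycle):
    """Assumed reading: a strictly gains by swapping seats with b."""
    after = utility(a, swapped(arrangement, a, b), pref, cycle)
    return after > utility(a, arrangement, pref, cycle)

def is_envy_free(arrangement, pref, cycle=True):
    return not any(envies(a, b, arrangement, pref, cycle)
                   for a in arrangement for b in arrangement if a != b)

def is_stable(arrangement, pref, cycle=True):
    return not any(envies(a, b, arrangement, pref, cycle)
                   and envies(b, a, arrangement, pref, cycle)
                   for a, b in combinations(arrangement, 2))

# Agents 0 and 1 like each other; agent 2 is indifferent. Path table.
pref = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
apart = [0, 2, 1]     # agent 2 sits between agents 0 and 1
together = [0, 1, 2]  # agents 0 and 1 sit side by side
```

On the path table, `apart` is stable but not envy-free (agent 0 would gain from agent 2's seat, while agent 2 gains nothing in return), whereas `together` is both. This shows that stability is the weaker of the two conditions.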
Submitted 6 October, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
A Fair and Resilient Decentralized Clock Network for Transaction Ordering
Authors:
Andrei Constantinescu,
Diana Ghinea,
Lioba Heimbach,
Zilin Wang,
Roger Wattenhofer
Abstract:
Traditional blockchain design gives miners or validators full control over transaction ordering, i.e., they can freely choose which transactions to include or exclude, as well as in which order. While not an issue initially, the emergence of decentralized finance has introduced new transaction order dependencies allowing parties in control of the ordering to make a profit by front-running others' transactions. In this work, we present the Decentralized Clock Network, a new approach for achieving fair transaction ordering. Users submit their transactions to the network's clocks, which run an agreement protocol that provides each transaction with a timestamp of receipt which is then used to define the transactions' order. By separating agreement from ordering, our protocol is efficient and has a simpler design compared to other available solutions. Moreover, our protocol brings to the blockchain world the paradigm of asynchronous fallback, where the algorithm operates with stronger fairness guarantees during periods of synchronous use, switching to an asynchronous mode only during times of increased network delay.
Submitted 18 December, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Discovering Graph Generation Algorithms
Authors:
Mihai Babiac,
Karolis Martinkus,
Roger Wattenhofer
Abstract:
We provide a novel approach to construct generative models for graphs. Instead of using the traditional probabilistic models or deep generative models, we propose to instead find an algorithm that generates the data. We achieve this using evolutionary search and a powerful fitness function, implemented by a randomly initialized graph neural network. This brings certain advantages over current deep generative models, for instance, a higher potential for out-of-training-distribution generalization and direct interpretability, as the final graph generative process is expressed as a Python function. We show that this approach can be competitive with deep generative models and under some circumstances can even find the true graph generative process, and as such perfectly generalize.
Submitted 25 April, 2023;
originally announced April 2023.
-
DeFi Lending During The Merge
Authors:
Lioba Heimbach,
Eric Schertenleib,
Roger Wattenhofer
Abstract:
Lending protocols in decentralized finance enable the permissionless exchange of capital from lenders to borrowers without relying on a trusted third party for clearing or market-making. Interest rates are set by the supply and demand of capital according to a pre-defined function. In the lead-up to The Merge: Ethereum blockchain's transition from proof-of-work (PoW) to proof-of-stake (PoS), a fraction of the Ethereum ecosystem announced plans of continuing with a PoW-chain. Owners of ETH - whether their ETH was borrowed or not - would hold the native tokens on each chain. This development alarmed lending protocols. They feared spiking ETH borrowing rates would lead to mass liquidations which could undermine their viability. Thus, the decentralized autonomous organization running the protocols saw no alternative to intervention - restricting users' ability to borrow.
We investigate the effects of The Merge and the aforementioned intervention on the two biggest lending protocols on Ethereum: AAVE and Compound. Our analysis finds that borrowing rates were extremely volatile, jumping by two orders of magnitude, and that borrowing at times reached 100% of the available funds. Despite this, no spike in mass liquidations or irretrievable loans materialized. Further, we are the first to quantify and analyze hard-fork arbitrage, that is, profiting from holding debt in the native blockchain token during a hard fork. We find that arbitrageurs transferred tokens worth more than 13 million US$ at the time to centralized exchanges, money that was effectively extracted from the platforms' lenders.
Submitted 16 August, 2023; v1 submitted 23 January, 2023;
originally announced March 2023.
-
Abstract Visual Reasoning Enabled by Language
Authors:
Giacomo Camposampiero,
Loic Houmard,
Benjamin Estermann,
Joël Mathys,
Roger Wattenhofer
Abstract:
While artificial intelligence (AI) models have achieved human or even superhuman performance in many well-defined applications, they still struggle to show signs of broad and flexible intelligence. The Abstraction and Reasoning Corpus (ARC), a visual intelligence benchmark introduced by François Chollet, aims to assess how close AI systems are to human-like cognitive abilities. Most current approaches rely on carefully handcrafted domain-specific program searches to brute-force solutions for the tasks present in ARC. In this work, we propose a general learning-based framework for solving ARC. It is centered on transforming tasks from the vision to the language domain. This composition of language and vision allows for pre-trained models to be leveraged at each stage, enabling a shift from handcrafted priors towards the learned priors of the models. While not yet beating state-of-the-art models on ARC, we demonstrate the potential of our approach, for instance, by solving some ARC tasks that have not been solved previously.
Submitted 22 June, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
DAVA: Disentangling Adversarial Variational Autoencoder
Authors:
Benjamin Estermann,
Roger Wattenhofer
Abstract:
The use of well-disentangled representations offers many advantages for downstream tasks, e.g. an increased sample efficiency, or better interpretability. However, the quality of disentangled interpretations is often highly dependent on the choice of dataset-specific hyperparameters, in particular the regularization strength. To address this issue, we introduce DAVA, a novel training procedure for variational auto-encoders. DAVA completely alleviates the problem of hyperparameter selection. We compare DAVA to models with optimal hyperparameters. Without any hyperparameter tuning, DAVA is competitive on a diverse range of commonly used datasets. Underlying DAVA, we discover a necessary condition for unsupervised disentanglement, which we call PIPE. We demonstrate the ability of PIPE to positively predict the performance of downstream models in abstract reasoning. We also thoroughly investigate correlations with existing supervised and unsupervised metrics. The code is available at https://github.com/besterma/dava.
Submitted 2 March, 2023;
originally announced March 2023.
-
Computing the Best Policy That Survives a Vote
Authors:
Andrei Constantinescu,
Roger Wattenhofer
Abstract:
An assembly of $n$ voters needs to decide on $t$ independent binary issues. Each voter has opinions about the issues, given by a $t$-bit vector. Anscombe's paradox shows that a policy following the majority opinion in each issue may not survive a vote by the very same set of $n$ voters, i.e., more voters may feel unrepresented by such a majority-driven policy than represented. A natural resolution is to come up with a policy that deviates a bit from the majority policy but no longer gets more opposition than support from the electorate. We show that a Hamming distance to the majority policy of at most $\lfloor (t - 1) / 2 \rfloor$ can always be guaranteed, by giving a new probabilistic argument relying on structure-preserving symmetries of the space of potential policies. Unless the electorate is evenly divided between the two options on all issues, we in fact show that a policy strictly winning the vote exists within this distance bound. Our approach also leads to a deterministic polynomial-time algorithm for finding policies with the stated guarantees, answering an open problem of previous work. For odd $t$, unless we are in the pathological case described above, we also give a simpler and more efficient algorithm running in expected polynomial time with the same guarantees. We further show that checking whether distance strictly less than $\lfloor (t - 1) /2 \rfloor$ can be achieved is NP-hard, and that checking for distance at most some input $k$ is FPT with respect to several natural parameters.
Submitted 2 March, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
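To make the setting of the abstract above concrete, here is a minimal Python sketch of Anscombe's paradox and of a search for a nearby policy that survives the vote. It is not the paper's algorithm: it is a brute-force search that is exponential in $t$, shown only for illustration. The function names, the tie-breaking rule, and the exact notion of support (a voter supports a policy when agreeing with it on more than half of the issues and opposes it when agreeing on fewer than half) are assumptions made for this example.

```python
from itertools import combinations


def majority_policy(voters):
    """Per-issue majority of the voters' bit vectors (ties go to 1, an arbitrary choice)."""
    n = len(voters)
    t = len(voters[0])
    return tuple(1 if 2 * sum(v[i] for v in voters) >= n else 0 for i in range(t))


def vote_margin(policy, voters):
    """Number of supporters minus number of opponents of a policy.

    Assumption: a voter supports the policy when agreeing on more than
    half of the issues and opposes it when agreeing on fewer than half.
    """
    t = len(policy)
    margin = 0
    for v in voters:
        agree = sum(a == b for a, b in zip(policy, v))
        if 2 * agree > t:
            margin += 1
        elif 2 * agree < t:
            margin -= 1
    return margin


def closest_surviving_policy(voters):
    """Brute-force search within Hamming distance floor((t - 1) / 2) of the majority policy.

    Returns (policy, distance) for the first policy found whose opposition
    does not exceed its support, or None if the search finds nothing.
    """
    base = majority_policy(voters)
    t = len(base)
    for d in range((t - 1) // 2 + 1):
        for flips in combinations(range(t), d):
            candidate = tuple(1 - b if i in flips else b for i, b in enumerate(base))
            if vote_margin(candidate, voters) >= 0:
                return candidate, d
    return None


# Hypothetical electorate: 5 voters, 3 issues.
voters = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (1, 1, 1)]
print(majority_policy(voters))           # (1, 1, 1)
print(vote_margin((1, 1, 1), voters))    # -1: three opponents, two supporters
print(closest_surviving_policy(voters))  # ((0, 1, 1), 1)
```

In this example every issue has a 3-to-2 majority for 1, yet three of the five voters disagree with the majority policy on two of three issues. Flipping a single issue, which stays within the bound $\lfloor (t - 1) / 2 \rfloor = 1$, yields a policy with four supporters and one opponent.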
-
Electrode Clustering and Bandpass Analysis of EEG Data for Gaze Estimation
Authors:
Ard Kastrati,
Martyna Beata Plomecka,
Joël Küchler,
Nicolas Langer,
Roger Wattenhofer
Abstract:
In this study, we validate the findings of previously published papers, showing the feasibility of electroencephalography (EEG)-based gaze estimation. Moreover, we extend previous research by demonstrating that, with only a slight drop in model performance, we can significantly reduce the number of electrodes, indicating that a high-density, expensive EEG cap is not necessary for EEG-based eye tracking. Using data-driven approaches, we establish which electrode clusters impact gaze estimation and how the different types of EEG data preprocessing affect the models' performance. Finally, we also inspect which recorded frequencies are most important for the defined tasks.
Submitted 19 February, 2023;
originally announced February 2023.
-
The Hidden Shortcomings of (D)AOs -- An Empirical Study of On-Chain Governance
Authors:
Rainer Feichtinger,
Robin Fritsch,
Yann Vonlanthen,
Roger Wattenhofer
Abstract:
Decentralized autonomous organizations (DAOs) are a recent innovation in organizational structures, which are already widely used in the blockchain ecosystem. We empirically study the on-chain governance systems of 21 DAOs and open-source the live dataset. The DAOs we study are of various sizes and activity levels, and govern a wide range of protocols and services, such as decentralized exchanges, lending protocols, infrastructure projects and common goods funding. Our analysis unveils a high concentration of voting rights, significant hidden monetary costs of on-chain governance systems, as well as a remarkably high amount of pointless governance activity.
Submitted 28 February, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
DeFi and NFTs Hinder Blockchain Scalability
Authors:
Lioba Heimbach,
Quentin Kniep,
Yann Vonlanthen,
Roger Wattenhofer
Abstract:
Many classical blockchains are known to have an embarrassingly low transaction throughput, down to Bitcoin's notorious seven transactions per second limit. Various proposals and implementations for increasing throughput emerged in the first decade of blockchain research. But how much concurrency is possible? In their early days, blockchains were mostly used for simple transfers from user to user. More recently, however, decentralized finance (DeFi) and NFT marketplaces have completely changed what is happening on blockchains. Both are built using smart contracts and have gained significant popularity. Transactions on DeFi and NFT marketplaces often interact with the same smart contracts. We believe this development has transformed blockchain usage. In our work, we perform a historical analysis of Ethereum's transaction graph. We study how much interaction between transactions there was historically and how much there is now. We find that the rise of DeFi and NFT marketplaces has led to an increase in "centralization" in the transaction graph. More transactions are now interconnected: currently there are around 200 transactions per block with 4000 interdependencies between them. We further find that the parallelizability of Ethereum's current interconnected transaction workload is limited. A speedup exceeding a factor of five is currently unrealistic.
Submitted 7 March, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
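One way to see why interdependencies cap parallel speedup is a critical-path estimate, sketched below in Python. This is an illustration under simplifying assumptions, not the paper's methodology: every transaction is assumed to cost one time unit, the number of cores is assumed unlimited, and a transaction may only depend on transactions earlier in the block. The function names and the toy block are hypothetical.

```python
def critical_path_length(n_tx, deps):
    """Length of the longest dependency chain in a block.

    n_tx: number of transactions, indexed 0..n_tx-1 in block order.
    deps: dict mapping a transaction index to the earlier indices it depends on.
    """
    depth = [1] * n_tx
    for tx in range(n_tx):
        for earlier in deps.get(tx, ()):
            if earlier >= tx:
                raise ValueError("a transaction may only depend on earlier ones")
            # depth[earlier] is already final because earlier < tx.
            depth[tx] = max(depth[tx], depth[earlier] + 1)
    return max(depth, default=0)


def ideal_speedup(n_tx, deps):
    """Upper bound on speedup with unlimited cores and unit-cost transactions."""
    longest = critical_path_length(n_tx, deps)
    return n_tx / longest if longest else 1.0


# Hypothetical block: chains 0 -> 1 -> 2 and 3 -> 4, plus the independent transaction 5.
deps = {1: [0], 2: [1], 4: [3]}
print(critical_path_length(6, deps))  # 3
print(ideal_speedup(6, deps))         # 2.0
```

Sequential execution takes six time units here, while no schedule can beat the three-unit chain 0 -> 1 -> 2, so the speedup is at most 2 regardless of the number of cores. The more transactions touch the same contracts, the longer such chains tend to become.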
-
Short Squeeze in DeFi Lending Market: Decentralization in Jeopardy?
Authors:
Lioba Heimbach,
Eric G. Schertenleib,
Roger Wattenhofer
Abstract:
Anxiety levels in the Aave community spiked in November 2022 as Avi Eisenberg performed an attack on Aave. Eisenberg attempted to short the CRV token by using funds borrowed on the protocol to artificially deflate the value of CRV. While the attack was ultimately unsuccessful, it left the Aave community scared and even raised question marks regarding the feasibility of large lending platforms under decentralized governance.
In this work, we analyze Avi Eisenberg's actions and show how he was able to artificially lower the price of CRV by selling large quantities of borrowed CRV for stablecoins on both decentralized and centralized exchanges. Despite its failure, his attack still led to irretrievable debt worth more than 1.5 million USD at the time and thereby quadrupled the protocol's irretrievable debt. Furthermore, we highlight that his attack was enabled by the vast proportion of CRV available to borrow as well as by Aave's lending protocol design, which hindered rapid intervention. We stress that Eisenberg's attack exposes a predicament of large DeFi lending protocols: limit the scope or compromise on 'decentralization'.
Submitted 21 June, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
FedHQL: Federated Heterogeneous Q-Learning
Authors:
Flint Xiaofeng Fan,
Yining Ma,
Zhongxiang Dai,
Cheston Tan,
Bryan Kian Hsiang Low,
Roger Wattenhofer
Abstract:
Federated Reinforcement Learning (FedRL) encourages distributed agents to learn collectively from each other's experience to improve their performance without exchanging their raw trajectories. The existing work on FedRL assumes that all participating agents are homogeneous, which requires all agents to share the same policy parameterization (e.g., network architectures and training configurations). However, in real-world applications, agents are often in disagreement about the architecture and the parameters, possibly also because of disparate computational budgets. Because homogeneity is not given in practice, we introduce the problem setting of Federated Reinforcement Learning with Heterogeneous And bLack-box agEnts (FedRL-HALE). We present the unique challenges this new setting poses and propose the Federated Heterogeneous Q-Learning (FedHQL) algorithm that principally addresses these challenges. We empirically demonstrate the efficacy of FedHQL in boosting the sample efficiency of heterogeneous agents with distinct policy parameterization using standard RL tasks.
Submitted 26 January, 2023;
originally announced January 2023.
-
Learning Graph Algorithms With Recurrent Graph Neural Networks
Authors:
Florian Grötschla,
Joël Mathys,
Roger Wattenhofer
Abstract:
Classical graph algorithms work well for combinatorial problems that can be thoroughly formalized and abstracted. Once the algorithm is derived, it generalizes to instances of any size. However, developing an algorithm that handles complex structures and interactions in the real world can be challenging. Rather than specifying the algorithm, we can try to learn it from the graph-structured data. Graph Neural Networks (GNNs) are inherently capable of working on graph structures; however, they struggle to generalize well, and learning on larger instances is challenging. In order to scale, we focus on a recurrent architecture design that can learn simple graph problems end to end on smaller graphs and then extrapolate to larger instances. As our main contribution, we identify three essential techniques for recurrent GNNs to scale. By using (i) skip connections, (ii) state regularization, and (iii) edge convolutions, we can guide GNNs toward extrapolation. This allows us to train on small graphs and apply the same model to much larger graphs during inference. Moreover, we empirically validate the extrapolation capabilities of our GNNs on algorithmic datasets.
Submitted 9 December, 2022;
originally announced December 2022.
-
Automating Rigid Origami Design
Authors:
Jeremia Geiger,
Karolis Martinkus,
Oliver Richter,
Roger Wattenhofer
Abstract:
Rigid origami has shown potential in a large diversity of practical applications. However, current rigid origami crease pattern design mostly relies on known tessellations. This strongly limits the diversity and novelty of patterns that can be created. In this work, we build upon the recently developed principle of three units method to formulate rigid origami design as a discrete optimization problem, the rigid origami game. Our implementation allows for a simple definition of diverse objectives and thereby expands the potential of rigid origami further to optimized, application-specific crease patterns. We showcase the flexibility of our formulation through the use of a diverse set of search methods in several illustrative case studies. We are able not only to construct various patterns that approximate given target shapes, but also to specify abstract, function-based rewards which result in novel, foldable and functional designs for everyday objects.
Submitted 28 April, 2023; v1 submitted 20 November, 2022;
originally announced November 2022.
-
Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations
Authors:
Yu Fei,
Ping Nie,
Zhao Meng,
Roger Wattenhofer,
Mrinmaya Sachan
Abstract:
Recent work has demonstrated that pre-trained language models (PLMs) are zero-shot learners. However, most existing zero-shot methods involve heavy human engineering or complicated self-training pipelines, hindering their application to new situations. In this work, we show that zero-shot text classification can be improved simply by clustering texts in the embedding spaces of PLMs. Specifically, we fit the unlabeled texts with a Bayesian Gaussian Mixture Model after initializing cluster positions and shapes using class names. Despite its simplicity, this approach achieves superior or comparable performance on both topic and sentiment classification datasets and outperforms prior works significantly on unbalanced datasets. We further explore the applicability of our clustering approach by evaluating it on 14 datasets with more diverse topics, text lengths, and numbers of classes. Our approach achieves an average of 20% absolute improvement over prompt-based zero-shot learning. Finally, we compare different PLM embedding spaces and find that texts are well-clustered by topics even if the PLM is not explicitly pre-trained to generate meaningful sentence embeddings. This work indicates that PLM embeddings can categorize texts without task-specific fine-tuning, thus providing a new way to analyze and utilize their knowledge and zero-shot learning ability.
Submitted 23 November, 2022; v1 submitted 29 October, 2022;
originally announced October 2022.
-
Neural Combinatorial Logic Circuit Synthesis from Input-Output Examples
Authors:
Peter Belcak,
Roger Wattenhofer
Abstract:
We propose a novel, fully explainable neural approach to synthesis of combinatorial logic circuits from input-output examples. The carrying advantage of our method is that it readily extends to inductive scenarios, where the set of examples is incomplete but still indicative of the desired behaviour. Our method can be employed for a virtually arbitrary choice of atoms - from logic gates to FPGA blocks - as long as they can be formulated in a differentiable fashion, and consistently yields good results for synthesis of practical circuits of increasing size. In particular, we succeed in learning a number of arithmetic, bitwise, and signal-routing operations, and even generalise towards the correct behaviour in inductive scenarios. Our method, attacking a discrete logical synthesis problem with an explainable neural approach, hints at a wider promise for synthesis and reasoning-related tasks.
Submitted 29 October, 2022;
originally announced October 2022.