-
Does It Spin? On the Adoption and Use of QUIC's Spin Bit
Authors:
Ike Kunze,
Constantin Sander,
Klaus Wehrle
Abstract:
Encrypted QUIC traffic complicates network management as traditional transport layer semantics can no longer be used for RTT or packet loss measurements. Addressing this challenge, QUIC includes an optional, carefully designed mechanism: the spin bit. While its capabilities have already been studied in test settings, its real-world usefulness and adoption are unknown. In this paper, we thus invest…
▽ More
Encrypted QUIC traffic complicates network management as traditional transport layer semantics can no longer be used for RTT or packet loss measurements. Addressing this challenge, QUIC includes an optional, carefully designed mechanism: the spin bit. While its capabilities have already been studied in test settings, its real-world usefulness and adoption are unknown. In this paper, we thus investigate the spin bit's deployment and utility on the web.
Analyzing our long-term measurements of more than 200M domains, we find that the spin bit is enabled on ~10% of those with QUIC support and for ~50% / 60% of the underlying IPv4 / IPv6 hosts. The support is mainly driven by medium-sized cloud providers while most hyperscalers do not implement it. Assessing the utility of spin bit RTT measurements, the theoretical issue of reordering does not significantly manifest in our study and the spin bit provides accurate estimates for around 30.5% of connections using the mechanism, but drastically overestimates the RTT for another 51.7%. Overall, we conclude that the spin bit, even though an optional feature, indeed sees use in the wild and is able to provide reasonable RTT estimates for a solid share of QUIC connections, but requires solutions for making its measurements more robust.
△ Less
Submitted 4 October, 2023;
originally announced October 2023.
-
ECN with QUIC: Challenges in the Wild
Authors:
Constantin Sander,
Ike Kunze,
Leo Blöcher,
Mike Kosek,
Klaus Wehrle
Abstract:
TCP and QUIC can both leverage ECN to avoid congestion loss and its retransmission overhead. However, both protocols require support of their remote endpoints and it took two decades since the initial standardization of ECN for TCP to reach 80% ECN support and more in the wild. In contrast, the QUIC standard mandates ECN support, but there are notable ambiguities that make it unclear if and how EC…
▽ More
TCP and QUIC can both leverage ECN to avoid congestion loss and its retransmission overhead. However, both protocols require support of their remote endpoints and it took two decades since the initial standardization of ECN for TCP to reach 80% ECN support and more in the wild. In contrast, the QUIC standard mandates ECN support, but there are notable ambiguities that make it unclear if and how ECN can actually be used with QUIC on the Internet. Hence, in this paper, we analyze ECN support with QUIC in the wild: We conduct repeated measurements on more than 180M domains to identify HTTP/3 websites and analyze the underlying QUIC connections w.r.t. ECN support. We only find 20% of QUIC hosts, providing 6% of HTTP/3 websites, to mirror client ECN codepoints. Yet, mirroring ECN is only half of what is required for ECN with QUIC, as QUIC validates mirrored ECN codepoints to detect network impairments: We observe that less than 2% of QUIC hosts, providing less than 0.3% of HTTP/3 websites, pass this validation. We identify possible root causes in content providers not supporting ECN via QUIC and network impairments hindering ECN. We thus also characterize ECN with QUIC distributedly to traverse other paths and discuss our results w.r.t. QUIC and ECN innovations beyond QUIC.
△ Less
Submitted 25 September, 2023;
originally announced September 2023.
-
Secrets Revealed in Container Images: An Internet-wide Study on Occurrence and Impact
Authors:
Markus Dahlmanns,
Constantin Sander,
Robin Decker,
Klaus Wehrle
Abstract:
Containerization allows bundling applications and their dependencies into a single image. The containerization framework Docker eases the use of this concept and enables sharing images publicly, gaining high momentum. However, it can lead to users creating and sharing images that include private keys or API secrets-either by mistake or out of negligence. This leakage impairs the creator's security…
▽ More
Containerization allows bundling applications and their dependencies into a single image. The containerization framework Docker eases the use of this concept and enables sharing images publicly, gaining high momentum. However, it can lead to users creating and sharing images that include private keys or API secrets-either by mistake or out of negligence. This leakage impairs the creator's security and that of everyone using the image. Yet, the extent of this practice and how to counteract it remains unclear.
In this paper, we analyze 337,171 images from Docker Hub and 8,076 other private registries unveiling that 8.5% of images indeed include secrets. Specifically, we find 52,107 private keys and 3,158 leaked API secrets, both opening a large attack surface, i.e., putting authentication and confidentiality of privacy-sensitive data at stake and even allow active attacks. We further document that those leaked keys are used in the wild: While we discovered 1,060 certificates relying on compromised keys being issued by public certificate authorities, based on further active Internet measurements, we find 275,269 TLS and SSH hosts using leaked private keys for authentication. To counteract this issue, we discuss how our methodology can be used to prevent secret leakage and reuse.
△ Less
Submitted 8 July, 2023;
originally announced July 2023.
-
Tracking the QUIC Spin Bit on Tofino
Authors:
Ike Kunze,
Constantin Sander,
Klaus Wehrle,
Jan Rüth
Abstract:
QUIC offers security and privacy for modern web traffic by closely integrating encryption into its transport functionality. In this process, it hides transport layer information often used for network monitoring, thus obsoleting traditional measurement concepts. To still enable passive RTT estimations, QUIC introduces a dedicated measurement bit - the spin bit. While simple in its design, tracking…
▽ More
QUIC offers security and privacy for modern web traffic by closely integrating encryption into its transport functionality. In this process, it hides transport layer information often used for network monitoring, thus obsoleting traditional measurement concepts. To still enable passive RTT estimations, QUIC introduces a dedicated measurement bit - the spin bit. While simple in its design, tracking the spin bit at line-rate can become challenging for software-based solutions. Dedicated hardware trackers are also unsuitable as the spin bit is not invariant and can change in the future. Thus, this paper investigates whether P4-programmable hardware, such as the Intel Tofino, can effectively track the spin bit at line-rate. We find that the core functionality of the spin bit can be realized easily, and our prototype has an accuracy close to software-based trackers. Our prototype further protects against faulty measurements caused by reordering and prepares the data according to the needs of network operators, e.g., by classifying samples into pre-defined RTT classes. Still, distinct concepts in QUIC, such as its connection ID, are challenging with current hardware capabilities.
△ Less
Submitted 6 December, 2021;
originally announced December 2021.
-
Sharding and HTTP/2 Connection Reuse Revisited: Why Are There Still Redundant Connections?
Authors:
Constantin Sander,
Leo Blöcher,
Klaus Wehrle,
Jan Rüth
Abstract:
HTTP/2 and HTTP/3 avoid concurrent connections but instead multiplex requests over a single connection. Besides enabling new features, this reduces overhead and enables fair bandwidth sharing. Redundant connections should hence be a story of the past with HTTP/2. However, they still exist, potentially hindering innovation and performance. Thus, we measure their spread and analyze their causes in t…
▽ More
HTTP/2 and HTTP/3 avoid concurrent connections but instead multiplex requests over a single connection. Besides enabling new features, this reduces overhead and enables fair bandwidth sharing. Redundant connections should hence be a story of the past with HTTP/2. However, they still exist, potentially hindering innovation and performance. Thus, we measure their spread and analyze their causes in this paper. We find that 36% - 72% of the 6.24M HTTP Archive and 78% of the Alexa Top 100k websites cause Chromium-based webbrowsers to open superfluous connections. We mainly attribute these to domain sharding, despite HTTP/2 efforts to revert it, and DNS load balancing, but also the Fetch Standard.
△ Less
Submitted 27 October, 2021;
originally announced October 2021.
-
Video Conferencing and Flow-Rate Fairness: A First Look at Zoom and the Impact of Flow-Queuing AQM
Authors:
Constantin Sander,
Ike Kunze,
Klaus Wehrle,
Jan Rüth
Abstract:
Congestion control is essential for the stability of the Internet and the corresponding algorithms are commonly evaluated for interoperability based on flow-rate fairness. In contrast, video conferencing software such as Zoom uses custom congestion control algorithms whose fairness behavior is mostly unknown. Aggravatingly, video conferencing has recently seen a drastic increase in use - partly ca…
▽ More
Congestion control is essential for the stability of the Internet and the corresponding algorithms are commonly evaluated for interoperability based on flow-rate fairness. In contrast, video conferencing software such as Zoom uses custom congestion control algorithms whose fairness behavior is mostly unknown. Aggravatingly, video conferencing has recently seen a drastic increase in use - partly caused by the COVID-19 pandemic - and could hence negatively affect how available Internet resources are shared. In this paper, we thus investigate the flow-rate fairness of video conferencing congestion control at the example of Zoom and influences of deploying AQM. We find that Zoom is slow to react to bandwidth changes and uses two to three times the bandwidth of TCP in low-bandwidth scenarios. Moreover, also when competing with delay aware congestion control such as BBR, we see high queuing delays. AQM reduces these queuing delays and can equalize the bandwidth use when used with flow-queuing. However, it then introduces high packet loss for Zoom, leaving the question how delay and loss affect Zoom's QoE. We hence show a preliminary user study in the appendix which indicates that the QoE is at least not improved and should be studied further.
△ Less
Submitted 2 July, 2021;
originally announced July 2021.
-
Inference of cell dynamics on perturbation data using adjoint sensitivity
Authors:
Weiqi Ji,
Bo Yuan,
Ciyue Shen,
Aviv Regev,
Chris Sander,
Sili Deng
Abstract:
Data-driven dynamic models of cell biology can be used to predict cell response to unseen perturbations. Recent work (CellBox) had demonstrated the derivation of interpretable models with explicit interaction terms, in which the parameters were optimized using machine learning techniques. While the previous work was tested only in a single biological setting, this work aims to extend the range of…
▽ More
Data-driven dynamic models of cell biology can be used to predict cell response to unseen perturbations. Recent work (CellBox) had demonstrated the derivation of interpretable models with explicit interaction terms, in which the parameters were optimized using machine learning techniques. While the previous work was tested only in a single biological setting, this work aims to extend the range of applicability of this model inference approach to a diversity of biological systems. Here we adapted CellBox in Julia differential programming and augmented the method with adjoint algorithms, which has recently been used in the context of neural ODEs. We trained the models using simulated data from both abstract and biology-inspired networks, which afford the ability to evaluate the recovery of the ground truth network structure. The resulting accuracy of prediction by these models is high both in terms of low error against data and excellent agreement with the network structure used for the simulated training data. While there is no analogous ground truth for real life biological systems, this work demonstrates the ability to construct and parameterize a considerable diversity of network models with high predictive ability. The expectation is that this kind of procedure can be used on real perturbation-response data to derive models applicable to diverse biological systems.
△ Less
Submitted 13 April, 2021;
originally announced April 2021.
-
DeePCCI: Deep Learning-based Passive Congestion Control Identification
Authors:
Constantin Sander,
Jan Rüth,
Oliver Hohlfeld,
Klaus Wehrle
Abstract:
Transport protocols use congestion control to avoid overloading a network. Nowadays, different congestion control variants exist that influence performance. Studying their use is thus relevant, but it is hard to identify which variant is used. While passive identification approaches exist, these require detailed domain knowledge and often also rely on outdated assumptions about how congestion cont…
▽ More
Transport protocols use congestion control to avoid overloading a network. Nowadays, different congestion control variants exist that influence performance. Studying their use is thus relevant, but it is hard to identify which variant is used. While passive identification approaches exist, these require detailed domain knowledge and often also rely on outdated assumptions about how congestion control operates and what data is accessible. We present DeePCCI, a passive, deep learning-based congestion control identification approach which does not need any domain knowledge other than training traffic of a congestion control variant. By only using packet arrival data, it is also directly applicable to encrypted (transport header) traffic. DeePCCI is therefore more easily extendable and can also be used with QUIC.
△ Less
Submitted 4 July, 2019;
originally announced July 2019.
-
3D Protein Structure Predicted from Sequence
Authors:
Debora S. Marks,
Lucy J. Colwell,
Robert Sheridan,
Thomas A. Hopf,
Andrea Pagnani,
Riccardo Zecchina,
Chris Sander
Abstract:
The evolutionary trajectory of a protein through sequence space is constrained by function and three-dimensional (3D) structure. Residues in spatial proximity tend to co-evolve, yet attempts to invert the evolutionary record to identify these constraints and use them to computationally fold proteins have so far been unsuccessful. Here, we show that co-variation of residue pairs, observed in a larg…
▽ More
The evolutionary trajectory of a protein through sequence space is constrained by function and three-dimensional (3D) structure. Residues in spatial proximity tend to co-evolve, yet attempts to invert the evolutionary record to identify these constraints and use them to computationally fold proteins have so far been unsuccessful. Here, we show that co-variation of residue pairs, observed in a large protein family, provides sufficient information to determine 3D protein structure. Using a data-constrained maximum entropy model of the multiple sequence alignment, we identify pairs of statistically coupled residue positions which are expected to be close in the protein fold, termed contacts inferred from evolutionary information (EICs). To assess the amount of information about the protein fold contained in these coupled pairs, we evaluate the accuracy of predicted 3D structures for proteins of 50-260 residues, from 15 diverse protein families, including a G-protein coupled receptor. These structure predictions are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The resulting low Cα-RMSD error range of 2.7-5.1Å, over at least 75% of the protein, indicates the potential for predicting essentially correct 3D structures for the thousands of protein families that have no known structure, provided they include a sufficiently large number of divergent sample sequences. With the current enormous growth in sequence information based on new sequencing technology, this opens the door to a comprehensive survey of protein 3D structures, including many not currently accessible to the experimental methods of structural genomics. This advance has potential applications in many biological contexts, such as synthetic biology, identification of functional sites in proteins and interpretation of the functional impact of genetic variants.
△ Less
Submitted 25 October, 2011; v1 submitted 23 October, 2011;
originally announced October 2011.