-
An Overview of Analysis Methods and Evaluation Results for Caching Strategies
Authors:
Gerhard Hasslinger,
Mahshid Okhovatzadeh,
Konstantinos Ntougias,
Frank Hasslinger,
Oliver Hohlfeld
Abstract:
We survey analytical methods and evaluation results for the performance assessment of caching strategies. Knapsack solutions are derived, which provide static caching bounds for independent requests and general bounds for dynamic caching under arbitrary request patterns. We summarize Markov- and time-to-live-based solutions, which assume specific stochastic processes for capturing web request streams and timing. We compare the performance of caching strategies with different knowledge about the properties of data objects regarding a broad set of caching demands. The efficiency of web caching must regard benefits for network-wide traffic load, energy consumption and quality-of-service aspects in a tradeoff with costs for updating and storage overheads.
Submitted 5 August, 2023;
originally announced August 2023.
-
Reviewing War: Unconventional User Reviews as a Side Channel to Circumvent Information Controls
Authors:
José Miguel Moreno,
Sergio Pastrana,
Jens Helge Reelfs,
Pelayo Vallina,
Andriy Panchenko,
Georgios Smaragdakis,
Oliver Hohlfeld,
Narseo Vallina-Rodriguez,
Juan Tapiador
Abstract:
During the first days of the 2022 Russian invasion of Ukraine, Russia's media regulator blocked access to many global social media platforms and news sites, including Twitter, Facebook, and the BBC. To bypass the information controls set by Russian authorities, pro-Ukrainian groups explored unconventional ways to reach out to the Russian population, such as posting war-related content in the user reviews of Russian businesses available on Google Maps or Tripadvisor. This paper provides a first analysis of this new phenomenon by analyzing the creative strategies to avoid state censorship. Specifically, we analyze reviews posted on these platforms from the beginning of the conflict to September 2022. We measure the channeling of war messages through user reviews in Tripadvisor and Google Maps, as well as in VK, a popular Russian social network. Our analysis of the content posted on these services reveals that users leveraged these platforms to seek and exchange humanitarian and travel advice, but also to disseminate disinformation and polarized messages. Finally, we analyze the response of platforms in terms of content moderation and their impact.
Submitted 1 February, 2023;
originally announced February 2023.
-
Reviewing Best Practices in Online Conferencing
Authors:
Simone Ferlin,
Oliver Hohlfeld,
Vaibhav Bajpai
Abstract:
The COVID-19 pandemic disrupted the usual ways the networking research community operates. This article reviews experiences organising and participating in virtual conferences during the COVID-19 pandemic between 2020 and 2021. Thanks to the broader scope of the Dagstuhl seminar on 'Climate Friendly Internet Research' held in July 2021, here we focus the discussion on the state of the art in technologies and practices applied in online events such as conferences, teaching, and other meetings and identify approaches that are successful as well as others that need improvement. We also present a set of best practices and recommendations for the community.
Submitted 22 December, 2022;
originally announced December 2022.
-
Characterizing the country-wide adoption and evolution of the Jodel messaging app in Saudi Arabia
Authors:
Jens Helge Reelfs,
Oliver Hohlfeld,
Markus Strohmaier,
Niklas Henckell
Abstract:
Social media is subject to constant growth and evolution, yet little is known about its early phases of adoption. To shed light on this aspect, this paper empirically characterizes the initial and country-wide adoption of a new type of social media in Saudi Arabia that happened in 2017. Unlike established social media, the studied network Jodel is anonymous and location-based, forming hundreds of independent communities country-wide whose adoption patterns we compare. We take a detailed and full view from the operator's perspective on the temporal and geographical dimensions of the evolution of these different communities, from their very first months of establishment to saturation. This way, we make the early adoption of a new type of social media visible, a process that is often invisible due to the lack of data covering the first days of a new network.
Submitted 9 May, 2022;
originally announced May 2022.
-
Anonymous Hyperlocal Communities: What do they talk about?
Authors:
Jens Helge Reelfs,
Oliver Hohlfeld,
Niklas Henckell
Abstract:
In this paper, we study what users talk about in a plethora of independent hyperlocal and anonymous online communities in a single country: Saudi Arabia (KSA). We base this perspective on performing a content classification of the Jodel network in the KSA. To do so, we first contribute a content classification schema that assesses both the intent (why) and the topic (what) of posts. We use the schema to label 15k randomly sampled posts and further classify the top 1k hashtags. We observe a rich set of benign (yet at times controversial in conservative regimes) intents and topics that dominantly address information requests, entertainment, or dating/flirting. By comparing two large cities (Riyadh and Jeddah), we further show that hyperlocality leads to shifts in topic popularity between local communities. By evaluating votes (content appreciation) and replies (reactions), we show that the communities react differently to different topics; e.g., entertaining posts are much appreciated through votes, receiving the least replies, while beliefs & politics receive similarly few replies but are controversially voted.
Submitted 10 March, 2022;
originally announced March 2022.
-
Differences in Social Media Usage Exist Between Western and Middle-East Countries
Authors:
Jens Helge Reelfs,
Oliver Hohlfeld,
Niklas Henckell
Abstract:
In this paper, we empirically analyze two examples of a Western (DE) versus Middle-East (SA) Online Social Messaging App. By comparing system interactions over time, we identify inherent differences in user engagement. We take a deep dive and shed light onto differences in user attention shifts and showcase their structural implications for the user experience. Our main findings show that, in comparison to their German counterparts, the Saudi communities prefer creating content in longer conversations, while voting more conservatively.
Submitted 30 January, 2022;
originally announced January 2022.
-
CyberBunker 2.0 -- A Domain and Traffic Perspective on a Bulletproof Hoster
Authors:
Daniel Kopp,
Eric Strehle,
Oliver Hohlfeld
Abstract:
In September 2019, 600 armed German cops seized the physical premises of a Bulletproof Hoster (BPH) referred to as CyberBunker 2.0. The hoster resided in a decommissioned NATO bunker and advertised to host everything but child porn and anything related to terrorism while keeping servers online no matter what. While the anatomy, economics and interconnection-level characteristics of BPHs are studied, their traffic characteristics are unknown. In this poster, we present the first analysis of domains, web pages, and traffic captured at a major tier-1 ISP and a large IXP at the time when the CyberBunker was in operation. Our study sheds light on traffic characteristics of a BPH in operation. We show that a traditional BGP-based BPH identification approach cannot detect the CyberBunker, but find characteristics from a domain and traffic perspective that can add to future identification approaches.
Submitted 14 September, 2021;
originally announced September 2021.
-
DDoS Never Dies? An IXP Perspective on DDoS Amplification Attacks
Authors:
Daniel Kopp,
Christoph Dietzel,
Oliver Hohlfeld
Abstract:
DDoS attacks remain a major security threat to the continuous operation of Internet edge infrastructures, web services, and cloud platforms. While a large body of research focuses on DDoS detection and protection, to date we have ultimately failed to eradicate DDoS altogether. Moreover, the landscape of DDoS attack mechanisms is still evolving, demanding an updated perspective on DDoS attacks in the wild. In this paper, we identify up to 2608 DDoS amplification attacks on a single day by analyzing multiple Tbps of traffic flows at a major IXP with a rich ecosystem of different networks. We observe the prevalence of well-known amplification attack protocols (e.g., NTP, CLDAP), which should no longer exist given the established mitigation strategies. Nevertheless, they constitute the largest fraction of DDoS amplification attacks within our observation, and we witness the emergence of DDoS attacks using recently discovered amplification protocols (e.g., OpenVPN, ARMS, Ubiquity Discovery Protocol). By analyzing the impact of DDoS on core Internet infrastructure, we show that DDoS can overload backbone capacity and that filtering approaches in prior work omit 97% of the attack traffic.
Submitted 7 March, 2021;
originally announced March 2021.
-
Understanding & Predicting User Lifetime with Machine Learning in an Anonymous Location-Based Social Network
Authors:
Jens Helge Reelfs,
Max Bergmann,
Oliver Hohlfeld,
Niklas Henckell
Abstract:
In this work, we predict the user lifetime within the anonymous and location-based social network Jodel in the Kingdom of Saudi Arabia. Jodel's location-based nature leads to the establishment of disjoint communities country-wide and enables, for the first time, the study of user lifetime in the case of a large set of disjoint communities. A user's lifetime is an important measurement for evaluating and steering customer bases as it can be leveraged to predict churn and possibly apply suitable methods to circumvent potential user losses. We train and test off-the-shelf machine learning techniques with 5-fold cross-validation to predict user lifetime as a regression and classification problem, identifying the Random Forest as providing very strong results. Discussing model complexity and quality trade-offs, we also dive deep into a time-dependent feature subset analysis, which does not work very well. Easing the classification problem into a binary decision (lifetime longer than timespan $x$) enables a practical lifetime predictor with very good performance. We identify implicit similarities across community models according to strong correlations in feature importance. A single countrywide model generalizes the problem and works equally well for any tested community; the overall model internally works similarly to the others, as also indicated by its feature importances.
Submitted 1 March, 2021;
originally announced March 2021.
-
The Boon and Bane of Cross-Signing: Shedding Light on a Common Practice in Public Key Infrastructures
Authors:
Jens Hiller,
Johanna Amann,
Oliver Hohlfeld
Abstract:
Public Key Infrastructures (PKIs) with their trusted Certificate Authorities (CAs) provide the trust backbone for the Internet: CAs sign certificates which prove the identity of servers, applications, or users. To be trusted by operating systems and browsers, a CA has to undergo lengthy and costly validation processes. Alternatively, trusted CAs can cross-sign other CAs to extend their trust to them. In this paper, we systematically analyze the present and past state of cross-signing in the Web PKI. Our dataset (derived from passive TLS monitors and public CT logs) encompasses more than 7 years and 225 million certificates with 9.3 billion trust paths. We show benefits and risks of cross-signing. We discuss the difficulty of revoking trusted CA certificates where, worryingly, cross-signing can cause valid trust paths to remain after revocation; a problem for non-browser software that often blindly trusts all CA certificates and ignores revocations. However, cross-signing also enables fast bootstrapping of new CAs, e.g., Let's Encrypt, and achieves a non-disruptive user experience by providing backward compatibility. In this paper, we propose new rules and guidance for cross-signing to preserve its positive potential while mitigating its risks.
Submitted 18 September, 2020;
originally announced September 2020.
-
The Lockdown Effect: Implications of the COVID-19 Pandemic on Internet Traffic
Authors:
Anja Feldmann,
Oliver Gasser,
Franziska Lichtblau,
Enric Pujol,
Ingmar Poese,
Christoph Dietzel,
Daniel Wagner,
Matthias Wichtlhuber,
Juan Tapiador,
Narseo Vallina-Rodriguez,
Oliver Hohlfeld,
Georgios Smaragdakis
Abstract:
Due to the COVID-19 pandemic, many governments imposed lockdowns that forced hundreds of millions of citizens to stay at home. The implementation of confinement measures increased Internet traffic demands of residential users, in particular, for remote working, entertainment, commerce, and education, which, as a result, caused traffic shifts in the Internet core. In this paper, using data from a diverse set of vantage points (one ISP, three IXPs, and one metropolitan educational network), we examine the effect of these lockdowns on traffic shifts. We find that the traffic volume increased by 15-20% almost within a week; while overall still modest, this constitutes a large increase within this short time period. However, despite this surge, we observe that the Internet infrastructure is able to handle the new volume, as most traffic shifts occur outside of traditional peak hours. When looking directly at the traffic sources, it turns out that, while hypergiants still contribute a significant fraction of traffic, we see (1) a higher increase in traffic of non-hypergiants, and (2) traffic increases in applications that people use when at home, such as Web conferencing, VPN, and gaming. While many networks see increased traffic demands, in particular, those providing services to residential users, academic networks experience major overall decreases. Yet, in these networks, we can observe substantial increases when considering applications associated with remote working and lecturing.
Submitted 5 October, 2020; v1 submitted 25 August, 2020;
originally announced August 2020.
-
Corona-Warn-App: Tracing the Start of the Official COVID-19 Exposure Notification App for Germany
Authors:
Jens Helge Reelfs,
Oliver Hohlfeld,
Ingmar Poese
Abstract:
On June 16, 2020, Germany launched an open-source smartphone contact tracing app ("Corona-Warn-App") to help trace SARS-CoV-2 (coronavirus) infection chains. It uses a decentralized, privacy-preserving design based on the Exposure Notification APIs in which a centralized server is only used to distribute a list of keys of SARS-CoV-2 infected users that is fetched by the app once per day. Its success, however, depends on its adoption. In this poster, we characterize the early adoption of the app using Netflow traces captured directly at its hosting infrastructure. We show that the app generated traffic from all over Germany already on the first day. We further observe that local COVID-19 outbreaks do not result in noticeable traffic increases.
Submitted 25 July, 2020;
originally announced August 2020.
-
Word-Emoji Embeddings from large scale Messaging Data reflect real-world Semantic Associations of Expressive Icons
Authors:
Jens Helge Reelfs,
Oliver Hohlfeld,
Markus Strohmaier,
Niklas Henckell
Abstract:
We train word-emoji embeddings on large-scale messaging data obtained from the Jodel online social network. Our data set contains more than 40 million sentences, of which 11 million sentences are annotated with a subset of the Unicode 13.0 standard Emoji list. We explore semantic emoji associations contained in this embedding by analyzing associations between emojis, between emojis and text, and between text and emojis. Our investigations demonstrate anecdotally that word-emoji embeddings trained on large-scale messaging data can reflect real-world semantic associations. To enable further research, we release the Jodel Emoji Embedding Dataset (JEED1488) containing 1488 emojis and their embeddings along 300 dimensions.
Submitted 19 May, 2020;
originally announced June 2020.
-
Multi-episodic Perceived Quality of an Audio-on-Demand Service
Authors:
Dennis Guse,
Oliver Hohlfeld,
Anna Wunderlich,
Benjamin Weiss,
Sebastian Möller
Abstract:
QoE is traditionally evaluated by using short stimuli usually representing parts or single usage episodes. This opens the question of how the overall service perception involving multiple usage episodes can be evaluated, a question of high practical relevance to service operators. Despite initial research on this challenging aspect of multi-episodic perceived quality, the underlying quality formation processes and their factors are still to be discovered. We present a multi-episodic experiment of an Audio on Demand service over a usage period of 6 days with 93 participants. Our work directly extends prior work investigating the impact of time between usage episodes. The results show similar effects; the recency effect is also not statistically significant. In addition, we extend prediction of multi-episodic judgments by accounting for the observed saturation.
Submitted 1 May, 2020;
originally announced May 2020.
-
MUST, SHOULD, DON'T CARE: TCP Conformance in the Wild
Authors:
Mike Kosek,
Leo Blöcher,
Jan Rüth,
Torsten Zimmermann,
Oliver Hohlfeld
Abstract:
Standards govern the SHOULD and MUST requirements for protocol implementers for interoperability. In the case of TCP, which carries the bulk of the Internet's traffic, these requirements are defined in RFCs. While it is known that not all optional features are implemented and nonconformance exists, one would assume that TCP implementations at least conform to the minimum set of MUST requirements. In this paper, we use Internet-wide scans to show how Internet hosts and paths conform to these basic requirements. We uncover a non-negligible set of hosts and paths that do not adhere to even basic requirements. For example, we observe hosts that do not correctly handle checksums and cases of middlebox interference for TCP options. We identify hosts that drop packets when the urgent pointer is set or simply crash. Our publicly available results highlight that conformance to even fundamental protocol requirements should not be taken for granted but instead checked regularly.
Submitted 19 March, 2020; v1 submitted 13 February, 2020;
originally announced February 2020.
-
Perceiving QUIC: Do Users Notice or Even Care?
Authors:
Jan Rüth,
Konrad Wolsing,
Klaus Wehrle,
Oliver Hohlfeld
Abstract:
QUIC, as the foundation for HTTP/3, is becoming an Internet reality. A plethora of studies already show that QUIC excels beyond TCP+TLS+HTTP/2. Yet, these studies compare a highly optimized QUIC Web stack against an unoptimized TCP-based stack. In this paper, we bring TCP up to speed to perform an eye-level comparison. Instead of relying on technical metrics, we perform two extensive user studies to investigate QUIC's impact on the quality of experience. First, we investigate if users can distinguish two protocol versions in a direct comparison, and we find that QUIC is indeed rated faster than TCP and even a tuned TCP. Yet, our second study shows that this perceived performance increase mostly does not matter to the users, and they rate QUIC and TCP as indistinguishable.
Submitted 17 October, 2019;
originally announced October 2019.
-
DDoS Hide & Seek: On the Effectiveness of a Booter Services Takedown
Authors:
Daniel Kopp,
Matthias Wichtlhuber,
Ingmar Poese,
Jair Santanna,
Oliver Hohlfeld,
Christoph Dietzel
Abstract:
Booter services continue to provide popular DDoS-as-a-service platforms and enable anyone, irrespective of their technical ability, to execute DDoS attacks with devastating impact. Since booters are a serious threat to Internet operations and can cause significant financial and reputational damage, they also draw the attention of law enforcement agencies and related counter activities. In this paper, we investigate booter-based DDoS attacks in the wild and the impact of an FBI takedown targeting 15 booter websites in December 2018 from the perspective of a major IXP and two ISPs. We study and compare attack properties of multiple booter services by launching Gbps-level attacks against our own infrastructure. To understand spatial and temporal trends of the DDoS traffic originating from booters, we scrutinize 5 months' worth of inter-domain traffic. We observe that the takedown only leads to a temporary reduction in attack traffic. Additionally, one booter was found to quickly continue operation by using a new domain for its website.
Submitted 16 September, 2019;
originally announced September 2019.
-
DeePCCI: Deep Learning-based Passive Congestion Control Identification
Authors:
Constantin Sander,
Jan Rüth,
Oliver Hohlfeld,
Klaus Wehrle
Abstract:
Transport protocols use congestion control to avoid overloading a network. Nowadays, different congestion control variants exist that influence performance. Studying their use is thus relevant, but it is hard to identify which variant is used. While passive identification approaches exist, these require detailed domain knowledge and often also rely on outdated assumptions about how congestion control operates and what data is accessible. We present DeePCCI, a passive, deep learning-based congestion control identification approach which does not need any domain knowledge other than training traffic of a congestion control variant. By only using packet arrival data, it is also directly applicable to encrypted (transport header) traffic. DeePCCI is therefore more easily extendable and can also be used with QUIC.
Submitted 4 July, 2019;
originally announced July 2019.
-
A Performance Perspective on Web Optimized Protocol Stacks: TCP+TLS+HTTP/2 vs. QUIC
Authors:
Konrad Wolsing,
Jan Rüth,
Klaus Wehrle,
Oliver Hohlfeld
Abstract:
Existing performance comparisons of QUIC and TCP compared an optimized QUIC to an unoptimized TCP stack. By neglecting available TCP improvements inherently included in QUIC, these comparisons do not shed light on the performance of current web stacks. In this paper, we show that the effect of tuning TCP parameters is not negligible and directly yields significant improvements. Nevertheless, QUIC still outperforms even our tuned variant of TCP. This performance advantage is mostly caused by QUIC's reduced RTT design during connection establishment and, in the case of lossy networks, by its ability to circumvent head-of-line blocking.
Submitted 18 June, 2019;
originally announced June 2019.
-
An Empirical View on Content Provider Fairness
Authors:
Jan Rüth,
Ike Kunze,
Oliver Hohlfeld
Abstract:
Congestion control is an indispensable component of transport protocols to prevent congestion collapse. As such, it distributes the available bandwidth among all competing flows, ideally in a fair manner. However, there exists a constantly evolving set of congestion control algorithms, each addressing different performance needs and providing the potential for custom parametrizations. In particular, content providers such as CDNs are known to tune TCP stacks for performance gains. In this paper, we thus empirically investigate if current Internet traffic generated by content providers still adheres to the conventional understanding of fairness. Our study compares fairness properties of testbed hosts to actual traffic of six major content providers subject to different bandwidths, RTTs, queue sizes, and queueing disciplines in a home-user setting. We find that some employed congestion control algorithms lead to significantly asymmetric bandwidth shares; however, AQMs such as FQ_CoDel are able to alleviate such unfairness.
Submitted 17 May, 2019;
originally announced May 2019.
-
Blitz-starting QUIC Connections
Authors:
Jan Rüth,
Konrad Wolsing,
Martin Serror,
Klaus Wehrle,
Oliver Hohlfeld
Abstract:
In this paper, we revisit the idea to remove Slow Start from congestion control. To do so, we build upon the newly gained freedom of transport protocol extendability offered by QUIC to hint bandwidth estimates from a typical web client to a server. Using this bandwidth estimate, we bootstrap congestion windows of new connections to quickly utilize available bandwidth. This custom flow initialization removes the common early exit of Slow Start and thus fuels short flow fairness with long-running connections. Our results indicate that we can drastically reduce flow completion time accepting some losses and thereby an inflated transmission volume. For example, for a typical DSL client, loading a 2 MB YouTube video chunk is accelerated by nearly 2x. In the worst case, we find an inflation of the transfer volume by 12% due to losses.
Submitted 8 May, 2019;
originally announced May 2019.
-
TheFragebogen: A Web Browser-based Questionnaire Framework for Scientific Research
Authors:
Dennis Guse,
Henrique R. Orefice,
Gabriel Reimers,
Oliver Hohlfeld
Abstract:
Quality of Experience (QoE) typically involves conducting experiments in which stimuli are presented to participants and their judgments as well as behavioral data are collected. Nowadays, many experiments require software for the presentation of stimuli and the data collection from participants. While different software solutions exist, these are not tailored to conduct experiments on QoE. Moreover, replicating experiments or repeating the same experiment in different settings (e.g., laboratory vs. crowdsourcing) can further increase the software complexity. TheFragebogen is an open-source, versatile, extendable software framework for the implementation of questionnaires, especially for research on QoE. Implemented questionnaires can be presented with a state-of-the-art web browser to support a broad range of devices, while the use of a web server is optional. Out-of-the-box, TheFragebogen provides graphical exact scales as well as free-hand input, the ability to collect behavioral data, and the ability to play back multimedia content.
Submitted 29 April, 2019;
originally announced April 2019.
-
Application-Agnostic Offloading of Packet Processing
Authors:
Oliver Hohlfeld,
Helge Reelfs,
Jan Rüth,
Florian Schmidt,
Torsten Zimmermann,
Jens Hiller,
Klaus Wehrle
Abstract:
As network speed increases, servers struggle to serve all requests directed at them. This challenge is rooted in a partitioned data path where the split between the kernel space networking stack and user space applications induces overheads. To address this challenge, we propose Santa, a new architecture to optimize the data path by enabling server applications to partially offload packet processing to a generic rule processor. We exemplify Santa by showing how it can drastically accelerate kernel-based packet processing - a currently neglected domain. Our evaluation of a broad class of applications, namely DNS, Memcached, and HTTP, highlights that Santa can substantially improve the server performance by a factor of 5.5, 2.1, and 2.5, respectively.
Submitted 1 April, 2019;
originally announced April 2019.
-
Hashtag Usage in a Geographically-Local Microblogging App
Authors:
Helge Reelfs,
Timon Mohaupt,
Oliver Hohlfeld,
Niklas Henckell
Abstract:
This paper studies for the first time the usage and propagation of hashtags in a new and fundamentally different type of social media that is i) without profiles and ii) location-based to only show nearby posted content. Our study is based on analyzing the mobile-only Jodel microblogging app, which has an established user base in several European countries and Saudi Arabia. All posts are user-to-user anonymous (i.e., no user handles are displayed) and are only displayed in the proximity of the user's location (up to 20 km). The app thereby forms local communities and opens the question of how information propagates within and between these communities. We tackle this question by applying established metrics for Twitter hashtags to a ground-truth data set of Jodel posts within Germany that spans three years. We find the usage of hashtags in Jodel to differ from Twitter; despite embracing local communication in its design, Jodel hashtags are mostly used country-wide.
Submitted 11 March, 2019;
originally announced March 2019.
-
Demystifying TCP Initial Window Configurations of Content Distribution Networks
Authors:
Jan Rüth,
Oliver Hohlfeld
Abstract:
Driven by their quest to improve web performance, Content Delivery Networks (CDNs) are known adopters of performance optimizations. In this regard, TCP congestion control, and particularly its initial congestion window (IW) size, is one long-debated topic that can influence CDN performance. Its size is, however, assumed to be static by IETF recommendations (despite being network- and application-dependent) and has only infrequently changed in its history. To understand whether the standardization and research perspective still meets Internet reality, we study the IW configurations of major CDNs. Our study uses a globally distributed infrastructure of VPNs giving access to residential access links, which makes it possible to shed light on network-dependent configurations. We observe that most CDNs are well aware of the IW's impact and find a high amount of customization that is beyond current Internet standards. Further, we find CDNs that utilize different IWs for different customers and content while others resort to fixed values. We find various initial window configurations, most below 50 segments yet with exceptions of up to 100 segments, ten times the current standard. Our study highlights that Internet reality has drifted away from recommended and standardized practices.
Submitted 24 February, 2019;
originally announced February 2019.
-
Hidden Treasures - Recycling Large-Scale Internet Measurements to Study the Internet's Control Plane
Authors:
Jan Rüth,
Torsten Zimmermann,
Oliver Hohlfeld
Abstract:
Internet-wide scans are a common active measurement approach to study the Internet, e.g., studying security properties or protocol adoption. They involve probing large address ranges (IPv4 or parts of IPv6) for specific ports or protocols. Besides their primary use for probing (e.g., studying protocol adoption), we show that - at the same time - they provide valuable insights into the Internet control plane informed by ICMP responses to these probes - a currently unexplored secondary use. We collect one week of ICMP responses (637.50M messages) to several Internet-wide ZMap scans covering multiple TCP and UDP ports as well as DNS-based scans covering > 50% of the domain name space. This perspective enables us to study the Internet's control plane as a by-product of Internet measurements. We receive ICMP messages from ~171M different IPs in roughly 53K different autonomous systems. Additionally, we uncover multiple control plane problems, e.g., we detect a plethora of outdated and misconfigured routers and uncover the presence of large-scale persistent routing loops in IPv4.
Submitted 22 January, 2019;
originally announced January 2019.
-
Is the Web ready for HTTP/2 Server Push?
Authors:
Torsten Zimmermann,
Benedikt Wolters,
Oliver Hohlfeld,
Klaus Wehrle
Abstract:
HTTP/2 supersedes HTTP/1.1 to tackle the performance challenges of the modern Web. A highly anticipated feature is Server Push, enabling servers to send data without explicit client requests, thus potentially saving time. Although guidelines on how to use Server Push emerged, measurements have shown that it can easily be used in a suboptimal way and hurt instead of improving performance. We thus tackle the question of whether the current Web can make better use of Server Push. First, we enable real-world websites to be replayed in a testbed to study the effects of different Server Push strategies. Using this, we next revisit proposed guidelines to grasp their performance impact. Finally, based on our results, we propose a novel strategy using an alternative server scheduler that enables interleaving of resources. This improves the visual progress for some websites, with minor modifications to the deployment. Still, our results highlight the limits of Server Push: a deep understanding of web engineering is required to make optimal use of it, and not every site will benefit.
Submitted 12 October, 2018;
originally announced October 2018.
-
Dissecting Apple's Meta-CDN during an iOS Update
Authors:
Jeremias Blendin,
Fabrice Bendfeldt,
Ingmar Poese,
Boris Koldehofe,
Oliver Hohlfeld
Abstract:
Content delivery networks (CDN) contribute more than 50% of today's Internet traffic. Meta-CDNs, an evolution of centrally controlled CDNs, promise increased flexibility by multihoming content. So far, efforts to understand the characteristics of Meta-CDNs focus mainly on third-party Meta-CDN services. A common, but unexplored, use case for Meta-CDNs is to use the CDNs mapping infrastructure to form self-operated Meta-CDNs integrating third-party CDNs. These CDNs assist in the build-up phase of a CDN's infrastructure or mitigate capacity shortages by offloading traffic. This paper investigates the Apple CDN as a prominent example of self-operated Meta-CDNs. We describe the involved CDNs, the request-mapping mechanism, and show the cache locations of the Apple CDN using measurements of more than 800 RIPE Atlas probes worldwide. We further measure its load-sharing behavior by observing a major iOS update in Sep. 2017, a significant event potentially reaching up to an estimated 1 billion iOS devices. Furthermore, by analyzing data from a European Eyeball ISP, we quantify third-party traffic offloading effects and find third-party CDNs increase their traffic by 438% while saturating seemingly unrelated links.
Submitted 6 October, 2018;
originally announced October 2018.
-
Digging into Browser-based Crypto Mining
Authors:
Jan Rüth,
Torsten Zimmermann,
Konrad Wolsing,
Oliver Hohlfeld
Abstract:
Mining is the foundation of blockchain-based cryptocurrencies such as Bitcoin, rewarding the miner for finding blocks for new transactions. The Monero currency enables mining with standard hardware, in contrast to the special hardware (ASICs) often used in Bitcoin, paving the way for in-browser mining as a new revenue model for website operators. In this work, we study the prevalence of this new phenomenon. We identify and classify mining websites in 138M domains and present a new fingerprinting method which finds up to a factor of 5.7 more miners than publicly available block lists. Our work identifies and dissects Coinhive as the major browser-mining stakeholder. Further, we present a new method to associate mined blocks in the Monero blockchain to mining pools and uncover that Coinhive currently contributes 1.18% of mined blocks, having turned over 1293 Moneros in June 2018.
Submitted 21 September, 2018; v1 submitted 2 August, 2018;
originally announced August 2018.
-
A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists
Authors:
Quirin Scheitle,
Oliver Hohlfeld,
Julien Gamba,
Jonas Jelten,
Torsten Zimmermann,
Stephen D. Strowes,
Narseo Vallina-Rodriguez
Abstract:
A broad range of research areas including Internet measurement, privacy, and network security rely on lists of target domains to be analysed; researchers make use of target lists for reasons of necessity or efficiency. The popular Alexa list of one million domains is a widely used example. Despite their prevalence in research papers, the soundness of top lists has seldom been questioned by the community: little is known about the lists' creation, representativity, potential biases, stability, or overlap between lists.
In this study we survey the extent, nature, and evolution of top lists used by research communities. We assess the structure and stability of these lists, and show that rank manipulation is possible for some lists. We also reproduce the results of several scientific studies to assess the impact of using a top list at all, which list specifically, and the date of list creation. We find that (i) top lists generally overestimate results compared to the general population by a significant margin, often even an order of magnitude, and (ii) some top lists have surprising change characteristics, causing high day-to-day fluctuation and leading to result instability. We conclude our paper with specific recommendations on the use of top lists, and how to interpret results based on top lists with caution.
Submitted 23 September, 2018; v1 submitted 29 May, 2018;
originally announced May 2018.
-
Characterizing a Meta-CDN
Authors:
Oliver Hohlfeld,
Jan Rüth,
Konrad Wolsing,
Torsten Zimmermann
Abstract:
CDNs have reshaped the Internet architecture at large. They operate (globally) distributed networks of servers to reduce latencies as well as to increase availability for content and to handle large traffic bursts. Traditionally, content providers were mostly limited to a single CDN operator. However, in recent years, more and more content providers employ multiple CDNs to serve the same content and provide the same services. Thus, switching between CDNs, which can be beneficial to reduce costs, to select CDNs by optimal performance in different geographic regions, or to overcome CDN-specific outages, becomes an important task. Services that tackle this task have emerged, also known as CDN brokers, Multi-CDN selectors, or Meta-CDNs. Despite their existence, little is known about Meta-CDN operation in the wild. In this paper, we thus shed light on this topic by dissecting a major Meta-CDN. Our analysis provides insights into its infrastructure, its operation in practice, and its usage by Internet sites. We leverage PlanetLab and RIPE Atlas as distributed infrastructures to study how a Meta-CDN impacts the web latency.
Submitted 24 February, 2019; v1 submitted 27 March, 2018;
originally announced March 2018.
-
Structure and Stability of Internet Top Lists
Authors:
Quirin Scheitle,
Jonas Jelten,
Oliver Hohlfeld,
Luca Ciprian,
Georg Carle
Abstract:
Active Internet measurement studies rely on a list of targets to be scanned. While probing the entire IPv4 address space is feasible for scans of limited complexity, more complex scans do not scale to measuring the full Internet. Thus, a sample of the Internet can be used instead, often in the form of a "top list". The most widely used list is the Alexa Global Top1M list. Despite their prevalence, the use of top lists is seldom questioned. Little is known about their creation, representativity, potential biases, stability, or overlap between lists. As a result, potential consequences of applying top lists in research are not known. In this study, we aim to open the discussion on top lists by investigating the aptness of frequently used top lists for empirical Internet scans, including stability, correlation, and potential biases of such lists.
Submitted 7 February, 2018;
originally announced February 2018.
-
A First Look at QUIC in the Wild
Authors:
Jan Rüth,
Ingmar Poese,
Christoph Dietzel,
Oliver Hohlfeld
Abstract:
For the first time since the establishment of TCP and UDP, the Internet transport layer is subject to a major change by the introduction of QUIC. Initiated by Google in 2012, QUIC provides a reliable, connection-oriented, low-latency, and fully encrypted transport. In this paper, we provide the first broad assessment of QUIC usage in the wild. First, we monitor the entire IPv4 address space since August 2016 and about 46% of the DNS namespace to detect QUIC-capable infrastructures. Our scans show that the number of QUIC-capable IPs has more than tripled since then to over 617.59K. We find around 161K domains hosted on QUIC-enabled infrastructure, but only 15K of them present valid certificates over QUIC. Second, we analyze one year of traffic traces provided by MAWI, one day of a major European tier-1 ISP and from a large IXP to understand the dominance of QUIC in the Internet traffic mix. We find QUIC to account for 2.6% to 9.1% of the current Internet traffic, depending on the vantage point. This share is dominated by Google pushing up to 42.1% of its traffic via QUIC.
Submitted 24 February, 2019; v1 submitted 16 January, 2018;
originally announced January 2018.