Search | arXiv e-print repository

Pushing Alias Resolution to the Limit

Authors: Taha Albakour, Oliver Gasser, Georgios Smaragdakis

Abstract: In this paper, we show that utilizing multiple protocols offers a unique opportunity to improve IP alias resolution and dual-stack inference substantially. Our key observation is that prevalent protocols, e.g., SSH and BGP, reply to unsolicited requests with a set of values that can be combined to form a unique device identifier. More importantly, this is possible by just completing the TCP hand-s… ▽ More In this paper, we show that utilizing multiple protocols offers a unique opportunity to improve IP alias resolution and dual-stack inference substantially. Our key observation is that prevalent protocols, e.g., SSH and BGP, reply to unsolicited requests with a set of values that can be combined to form a unique device identifier. More importantly, this is possible by just completing the TCP hand-shake. Our empirical study shows that utilizing readily available scans and our active measurements can double the discovered IPv4 alias sets and more than 30x the dual-stack sets compared to the state-of-the-art techniques. We provide insights into our method's accuracy and performance compared to popular techniques. △ Less

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.15612 [pdf, other]

Illuminating Router Vendor Diversity Within Providers and Along Network Paths

Authors: Taha Albakour, Oliver Gasser, Robert Beverly, Georgios Smaragdakis

Abstract: The Internet architecture has facilitated a multi-party, distributed, and heterogeneous physical infrastructure where routers from different vendors connect and inter-operate via IP. Such vendor heterogeneity can have important security and policy implications. For example, a security vulnerability may be specific to a particular vendor and implementation, and thus will have a disproportionate imp… ▽ More The Internet architecture has facilitated a multi-party, distributed, and heterogeneous physical infrastructure where routers from different vendors connect and inter-operate via IP. Such vendor heterogeneity can have important security and policy implications. For example, a security vulnerability may be specific to a particular vendor and implementation, and thus will have a disproportionate impact on particular networks and paths if exploited. From a policy perspective, governments are now explicitly banning particular vendors, or have threatened to do so. Despite these critical issues, the composition of router vendors across the Internet remains largely opaque. Remotely identifying router vendors is challenging due to their strict security posture, indistinguishability due to code sharing across vendors, and noise due to vendor mergers. We make progress in overcoming these challenges by develo** LFP, a tool that improves the coverage, accuracy, and efficiency of router fingerprinting as compared to the current state-of-the-art. We leverage LFP to characterize the degree of router vendor homogeneity within networks and the regional distribution of vendors. We then take a path-centric view and apply LFP to better understand the potential for correlated failures and fate-sharing. Finally, we perform a case study on inter- and intra-United States data paths to explore the feasibility to make vendor-based routing policy decisions, i.e., whether it is possible to avoid a particular vendor given the current infrastructure. △ Less

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2302.00598 [pdf, other]

Reviewing War: Unconventional User Reviews as a Side Channel to Circumvent Information Controls

Authors: José Miguel Moreno, Sergio Pastrana, Jens Helge Reelfs, Pelayo Vallina, Andriy Panchenko, Georgios Smaragdakis, Oliver Hohlfeld, Narseo Vallina-Rodriguez, Juan Tapiador

Abstract: During the first days of the 2022 Russian invasion of Ukraine, Russia's media regulator blocked access to many global social media platforms and news sites, including Twitter, Facebook, and the BBC. To bypass the information controls set by Russian authorities, pro-Ukrainian groups explored unconventional ways to reach out to the Russian population, such as posting war-related content in the user… ▽ More During the first days of the 2022 Russian invasion of Ukraine, Russia's media regulator blocked access to many global social media platforms and news sites, including Twitter, Facebook, and the BBC. To bypass the information controls set by Russian authorities, pro-Ukrainian groups explored unconventional ways to reach out to the Russian population, such as posting war-related content in the user reviews of Russian business available on Google Maps or Tripadvisor. This paper provides a first analysis of this new phenomenon by analyzing the creative strategies to avoid state censorship. Specifically, we analyze reviews posted on these platforms from the beginning of the conflict to September 2022. We measure the channeling of war messages through user reviews in Tripadvisor and Google Maps, as well as in VK, a popular Russian social network. Our analysis of the content posted on these services reveals that users leveraged these platforms to seek and exchange humanitarian and travel advice, but also to disseminate disinformation and polarized messages. Finally, we analyze the response of platforms in terms of content moderation and their impact. △ Less

Submitted 1 February, 2023; originally announced February 2023.

arXiv:2209.09603 [pdf, other]

doi 10.1145/3517745.3561431

Deep Dive into the IoT Backend Ecosystem

Authors: Said Jawad Saidi, Srdjan Matic, Oliver Gasser, Georgios Smaragdakis, Anja Feldmann

Abstract: Internet of Things (IoT) devices are becoming increasingly ubiquitous, e.g., at home, in enterprise environments, and in production lines. To support the advanced functionalities of IoT devices, IoT vendors as well as service and cloud companies operate IoT backends -- the focus of this paper. We propose a methodology to identify and locate them by (a) compiling a list of domains used exclusively… ▽ More Internet of Things (IoT) devices are becoming increasingly ubiquitous, e.g., at home, in enterprise environments, and in production lines. To support the advanced functionalities of IoT devices, IoT vendors as well as service and cloud companies operate IoT backends -- the focus of this paper. We propose a methodology to identify and locate them by (a) compiling a list of domains used exclusively by major IoT backend providers and (b) then identifying their server IP addresses. We rely on multiple sources, including IoT backend provider documentation, passive DNS data, and active scanning. For analyzing IoT traffic patterns, we rely on passive network flows from a major European ISP. Our analysis focuses on the top IoT backends and unveils diverse operational strategies -- from operating their own infrastructure to utilizing the public cloud. We find that the majority of the top IoT backend providers are located in multiple locations and countries. Still, a handful are located only in one country, which could raise regulatory scrutiny as the client IoT devices are located in other regions. Indeed, our analysis shows that up to 35% of IoT traffic is exchanged with IoT backend servers located in other continents. We also find that at least six of the top IoT backends rely on other IoT backend providers. We also evaluate if cascading effects among the IoT backend providers are possible in the event of an outage, a misconfiguration, or an attack. △ Less

Submitted 20 September, 2022; originally announced September 2022.

Journal ref: Proceedings of the 2022 ACM Internet Measurement Conference (IMC '22)

arXiv:2205.05028 [pdf, other]

A Tale of Two Markets: Investigating the Ransomware Payments Economy

Authors: Kris Oosthoek, Jack Cable, Georgios Smaragdakis

Abstract: Ransomware attacks are among the most severe cyber threats. They have made headlines in recent years by threatening the operation of governments, critical infrastructure, and corporations. Collecting and analyzing ransomware data is an important step towards understanding the spread of ransomware and designing effective defense and mitigation mechanisms. We report on our experience operating Ranso… ▽ More Ransomware attacks are among the most severe cyber threats. They have made headlines in recent years by threatening the operation of governments, critical infrastructure, and corporations. Collecting and analyzing ransomware data is an important step towards understanding the spread of ransomware and designing effective defense and mitigation mechanisms. We report on our experience operating Ransomwhere, an open crowdsourced ransomware payment tracker to collect information from victims of ransomware attacks. With Ransomwhere, we have gathered 13.5k ransom payments to more than 87 ransomware criminal actors with total payments of more than $101 million. Leveraging the transparent nature of Bitcoin, the cryptocurrency used for most ransomware payments, we characterize the evolving ransomware criminal structure and ransom laundering strategies. Our analysis shows that there are two parallel ransomware criminal markets: commodity ransomware and Ransomware as a Service (RaaS). We notice that there are striking differences between the two markets in the way that cryptocurrency resources are utilized, revenue per transaction, and ransom laundering efficiency. Although it is relatively easy to identify choke points in commodity ransomware payment activity, it is more difficult to do the same for RaaS. △ Less

Submitted 10 May, 2022; originally announced May 2022.

arXiv:2203.08946 [pdf, other]

One Bad Apple Can Spoil Your IPv6 Privacy

Authors: Said Jawad Saidi, Oliver Gasser, Georgios Smaragdakis

Abstract: IPv6 is being more and more adopted, in part to facilitate the millions of smart devices that have already been installed at home. Unfortunately, we find that the privacy of a substantial fraction of end-users is still at risk, despite the efforts by ISPs and electronic vendors to improve end-user security, e.g., by adopting prefix rotation and IPv6 privacy extensions. By analyzing passive data fr… ▽ More IPv6 is being more and more adopted, in part to facilitate the millions of smart devices that have already been installed at home. Unfortunately, we find that the privacy of a substantial fraction of end-users is still at risk, despite the efforts by ISPs and electronic vendors to improve end-user security, e.g., by adopting prefix rotation and IPv6 privacy extensions. By analyzing passive data from a large ISP, we find that around 19% of end-users' privacy can be at risk. When we investigate the root causes, we notice that a single device at home that encodes its MAC address into the IPv6 address can be utilized as a tracking identifier for the entire end-user prefix -- even if other devices use IPv6 privacy extensions. Our results show that IoT devices contribute the most to this privacy leakage and, to a lesser extent, personal computers and mobile devices. To our surprise, some of the most popular IoT manufacturers have not yet adopted privacy extensions that could otherwise mitigate this privacy risk. Finally, we show that third-party providers, e.g., hypergiants, can track up to 17% of subscriber lines in our study. △ Less

Submitted 16 March, 2022; originally announced March 2022.

Comments: Accepted at ACM SIGCOMM Computer Communication Review, to appear in the April 2022 issue

arXiv:2201.13086 [pdf, other]

doi 10.14722/ndss.2023.23112

Securing Federated Sensitive Topic Classification against Poisoning Attacks

Authors: Tianyue Chu, Alvaro Garcia-Recuero, Costas Iordanou, Georgios Smaragdakis, Nikolaos Laoutaris

Abstract: We present a Federated Learning (FL) based solution for building a distributed classifier capable of detecting URLs containing GDPR-sensitive content related to categories such as health, sexual preference, political beliefs, etc. Although such a classifier addresses the limitations of previous offline/centralised classifiers,it is still vulnerable to poisoning attacks from malicious users that ma… ▽ More We present a Federated Learning (FL) based solution for building a distributed classifier capable of detecting URLs containing GDPR-sensitive content related to categories such as health, sexual preference, political beliefs, etc. Although such a classifier addresses the limitations of previous offline/centralised classifiers,it is still vulnerable to poisoning attacks from malicious users that may attempt to reduce the accuracy for benign users by disseminating faulty model updates. To guard against this, we develop a robust aggregation scheme based on subjective logic and residual-based attack detection. Employing a combination of theoretical analysis, trace-driven simulation, as well as experimental validation with a prototype and real users, we show that our classifier can detect sensitive content with high accuracy, learn new labels fast, and remain robust in view of poisoning attacks from malicious users, as well as imperfect input from non-malicious ones. △ Less

Submitted 28 October, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

MSC Class: 68M25 ACM Class: I.2.11; K.4.1

Journal ref: Network and Distributed System Security (NDSS) Symposium 2023

arXiv:2110.03816 [pdf, other]

AS-Level BGP Community Usage Classification

Authors: Thomas Krenc, Robert Beverly, Georgios Smaragdakis

Abstract: BGP communities are a popular mechanism used by network operators for traffic engineering, blackholing, and to realize network policies and business strategies. In recent years, many research works have contributed to our understanding of how BGP communities are utilized, as well as how they can reveal secondary insights into real-world events such as outages and security attacks. However, one fun… ▽ More BGP communities are a popular mechanism used by network operators for traffic engineering, blackholing, and to realize network policies and business strategies. In recent years, many research works have contributed to our understanding of how BGP communities are utilized, as well as how they can reveal secondary insights into real-world events such as outages and security attacks. However, one fundamental question remains unanswered: "Which ASes tag announcements with BGP communities and which remove communities in the announcements they receive?" A grounded understanding of where BGP communities are added or removed can help better model and predict BGP-based actions in the Internet and characterize the strategies of network operators. In this paper we develop, validate, and share data from the first algorithm that can infer BGP community tagging and cleaning behavior at the AS-level. The algorithm is entirely passive and uses BGP update messages and snapshots, e.g. from public route collectors, as input. First, we quantify the correctness and accuracy of the algorithm in controlled experiments with simulated topologies. To validate in the wild, we announce prefixes with communities and confirm that more than 90% of the ASes that we classify behave as our algorithm predicts. Finally, we apply the algorithm to data from four sets of BGP collectors: RIPE, RouteViews, Isolario, and PCH. Tuned conservatively, our algorithm ascribes community tagging and cleaning behaviors to more than 13k ASes, the majority of which are large networks and providers. We make our algorithm and inferences available as a public resource to the BGP research community. △ Less

Submitted 7 October, 2021; originally announced October 2021.

arXiv:2109.15095 [pdf, other]

doi 10.1145/3487552.3487848

Third Time's Not a Charm: Exploiting SNMPv3 for Router Fingerprinting

Authors: Taha Albakour, Oliver Gasser, Robert Beverly, Georgios Smaragdakis

Abstract: In this paper, we show that adoption of the SNMPv3 network management protocol standard offers a unique -- but likely unintended -- opportunity for remotely fingerprinting network infrastructure in the wild. Specifically, by sending unsolicited and unauthenticated SNMPv3 requests, we obtain detailed information about the configuration and status of network devices including vendor, uptime, and the… ▽ More In this paper, we show that adoption of the SNMPv3 network management protocol standard offers a unique -- but likely unintended -- opportunity for remotely fingerprinting network infrastructure in the wild. Specifically, by sending unsolicited and unauthenticated SNMPv3 requests, we obtain detailed information about the configuration and status of network devices including vendor, uptime, and the number of restarts. More importantly, the reply contains a persistent and strong identifier that allows for lightweight Internet-scale alias resolution and dual-stack association. By launching active Internet-wide SNMPv3 scan campaigns, we show that our technique can fingerprint more than 4.6 million devices of which around 350k are network routers. Not only is our technique lightweight and accurate, it is complementary to existing alias resolution, dual-stack inference, and device fingerprinting approaches. Our analysis not only provides fresh insights into the router deployment strategies of network operators worldwide, but also highlights potential vulnerabilities of SNMPv3 as currently deployed. △ Less

Submitted 6 October, 2021; v1 submitted 30 September, 2021; originally announced September 2021.

Comments: Visit https://snmpv3.io for up-to-date SNMPv3 measurement results

Journal ref: Proceedings of the 2021 ACM Internet Measurement Conference (IMC '21)

arXiv:2010.13120 [pdf, other]

doi 10.1109/TNSM.2020.3034278

Exploring Network-Wide Flow Data with Flowyager

Authors: Said Jawad Saidi, Aniss Maghsoudlou, Damien Foucard, Georgios Smaragdakis, Ingmar Poese, Anja Feldmann

Abstract: Many network operations, ranging from attack investigation and mitigation to traffic management, require answering network-wide flow queries in seconds. Although flow records are collected at each router, using available traffic capture utilities, querying the resulting datasets from hundreds of routers across sites and over time, remains a significant challenge due to the sheer traffic volume and… ▽ More Many network operations, ranging from attack investigation and mitigation to traffic management, require answering network-wide flow queries in seconds. Although flow records are collected at each router, using available traffic capture utilities, querying the resulting datasets from hundreds of routers across sites and over time, remains a significant challenge due to the sheer traffic volume and distributed nature of flow records. In this paper, we investigate how to improve the response time for a priori unknown network-wide queries. We present Flowyager, a system that is built on top of existing traffic capture utilities. Flowyager generates and analyzes tree data structures, that we call Flowtrees, which are succinct summaries of the raw flow data available by capture utilities. Flowtrees are self-adjusted data structures that drastically reduce space and transfer requirements, by 75% to 95%, compared to raw flow records. Flowyager manages the storage and transfers of Flowtrees, supports Flowtree operators, and provides a structured query language for answering flow queries across sites and time periods. By deploying a Flowyager prototype at both a large Internet Exchange Point and a Tier-1 Internet Service Provider, we showcase its capabilities for networks with hundreds of router interfaces. Our results show that the query response time can be reduced by an order of magnitude when compared with alternative data analytics platforms. Thus, Flowyager enables interactive network-wide queries and offers unprecedented drill-down capabilities to, e.g., identify DDoS culprits, pinpoint the involved sites, and determine the length of the attack. △ Less

Submitted 27 October, 2020; v1 submitted 25 October, 2020; originally announced October 2020.

Comments: accepted at IEEE TNSM Journal DOI added

arXiv:2010.00745 [pdf, other]

Keep your Communities Clean: Exploring the Routing Message Impact of BGP Communities

Authors: Thomas Krenc, Robert Beverly, Georgios Smaragdakis

Abstract: BGP communities are widely used to tag prefix aggregates for policy, traffic engineering, and inter-AS signaling. Because individual ASes define their own community semantics, many ASes blindly propagate communities they do not recognize. Prior research has shown the potential security vulnerabilities when communities are not filtered. This work sheds light on a second unintended side-effect of co… ▽ More BGP communities are widely used to tag prefix aggregates for policy, traffic engineering, and inter-AS signaling. Because individual ASes define their own community semantics, many ASes blindly propagate communities they do not recognize. Prior research has shown the potential security vulnerabilities when communities are not filtered. This work sheds light on a second unintended side-effect of communities and permissive propagation: an increase in unnecessary BGP routing messages. Due to its transitive property, a change in the community attribute induces update messages throughout established routes, just updating communities. We ground our work by characterizing the handling of updates with communities, including when filtered, on multiple real-world BGP implementations in controlled laboratory experiments. We then examine 10 years of BGP messages observed in the wild at two route collector systems. In 2020, approximately 25% of all announcements modify the community attribute, but retain the AS path of the most recent announcement; an additional 25% update neither community nor AS path. Using predictable beacon prefixes, we demonstrate that communities lead to an increase in update messages both at the tagging AS and at neighboring ASes that neither add nor filter communities. This effect is prominent for geolocation communities during path exploration: on a single day, 63% of all unique community attributes are revealed exclusively due to global withdrawals. △ Less

Submitted 2 November, 2020; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:2009.01880 [pdf, other]

A Haystack Full of Needles: Scalable Detection of IoT Devices in the Wild

Authors: Said Jawad Saidi, Anna Maria Mandalari, Roman Kolcun, Hamed Haddadi, Daniel J. Dubois, David Choffnes, Georgios Smaragdakis, Anja Feldmann

Abstract: Consumer Internet of Things (IoT) devices are extremely popular, providing users with rich and diverse functionalities, from voice assistants to home appliances. These functionalities often come with significant privacy and security risks, with notable recent large scale coordinated global attacks disrupting large service providers. Thus, an important first step to address these risks is to know w… ▽ More Consumer Internet of Things (IoT) devices are extremely popular, providing users with rich and diverse functionalities, from voice assistants to home appliances. These functionalities often come with significant privacy and security risks, with notable recent large scale coordinated global attacks disrupting large service providers. Thus, an important first step to address these risks is to know what IoT devices are where in a network. While some limited solutions exist, a key question is whether device discovery can be done by Internet service providers that only see sampled flow statistics. In particular, it is challenging for an ISP to efficiently and effectively track and trace activity from IoT devices deployed by its millions of subscribers --all with sampled network data. In this paper, we develop and evaluate a scalable methodology to accurately detect and monitor IoT devices at subscriber lines with limited, highly sampled data in-the-wild. Our findings indicate that millions of IoT devices are detectable and identifiable within hours, both at a major ISP as well as an IXP, using passive, sparsely sampled network flow headers. Our methodology is able to detect devices from more than 77% of the studied IoT manufacturers, including popular devices such as smart speakers. While our methodology is effective for providing network analytics, it also highlights significant privacy consequences. △ Less

Submitted 30 September, 2020; v1 submitted 3 September, 2020; originally announced September 2020.

Comments: Accepted at the ACM Internet Measurement Conference 2020 (IMC'20)

arXiv:2008.10959 [pdf, other]

doi 10.1145/3419394.3423658

The Lockdown Effect: Implications of the COVID-19 Pandemic on Internet Traffic

Authors: Anja Feldmann, Oliver Gasser, Franziska Lichtblau, Enric Pujol, Ingmar Poese, Christoph Dietzel, Daniel Wagner, Matthias Wichtlhuber, Juan Tapiador, Narseo Vallina-Rodriguez, Oliver Hohlfeld, Georgios Smaragdakis

Abstract: Due to the COVID-19 pandemic, many governments imposed lock downs that forced hundreds of millions of citizens to stay at home. The implementation of confinement measures increased Internet traffic demands of residential users, in particular, for remote working, entertainment, commerce, and education, which, as a result, caused traffic shifts in the Internet core. In this paper, using data from a… ▽ More Due to the COVID-19 pandemic, many governments imposed lock downs that forced hundreds of millions of citizens to stay at home. The implementation of confinement measures increased Internet traffic demands of residential users, in particular, for remote working, entertainment, commerce, and education, which, as a result, caused traffic shifts in the Internet core. In this paper, using data from a diverse set of vantage points (one ISP, three IXPs, and one metropolitan educational network), we examine the effect of these lockdowns on traffic shifts. We find that the traffic volume increased by 15-20% almost within a week--while overall still modest, this constitutes a large increase within this short time period. However, despite this surge, we observe that the Internet infrastructure is able to handle the new volume, as most traffic shifts occur outside of traditional peak hours. When looking directly at the traffic sources, it turns out that, while hypergiants still contribute a significant fraction of traffic, we see (1) a higher increase in traffic of non-hypergiants, and (2) traffic increases in applications that people use when at home, such as Web conferencing, VPN, and gaming. While many networks see increased traffic demands, in particular, those providing services to residential users, academic networks experience major overall decreases. Yet, in these networks, we can observe substantial increases when considering applications associated to remote working and lecturing. △ Less

Submitted 5 October, 2020; v1 submitted 25 August, 2020; originally announced August 2020.

Journal ref: Proceedings of the 2020 Internet Measurement Conference (IMC '20)

arXiv:2004.10887 [pdf, other]

Towards Runtime Verification of Programmable Switches

Authors: Apoorv Shukla, Kevin Hudemann, Zsolt Vági, Lily Hügerich, Georgios Smaragdakis, Stefan Schmid, Artur Hecker, Anja Feldmann

Abstract: Is it possible to patch software bugs in P4 programs without human involvement? We show that this is partially possible in many cases due to advances in software testing and the structure of P4 programs. Our insight is that runtime verification can detect bugs, even those that are not detected at compile-time, with machine learning-guided fuzzing. This enables a more automated and real-time locali… ▽ More Is it possible to patch software bugs in P4 programs without human involvement? We show that this is partially possible in many cases due to advances in software testing and the structure of P4 programs. Our insight is that runtime verification can detect bugs, even those that are not detected at compile-time, with machine learning-guided fuzzing. This enables a more automated and real-time localization of bugs in P4 programs using software testing techniques like Tarantula. Once the bug in a P4 program is localized, the faulty code can be patched due to the programmable nature of P4. In addition, platform-dependent bugs can be detected. From P4_14 to P4_16 (latest version), our observation is that as the programmable blocks increase, the patchability of P4 programs increases accordingly. To this end, we design, develop, and evaluate P6 that (a) detects, (b) localizes, and (c) patches bugs in P4 programs with minimal human interaction. P6 tests P4 switch non-intrusively, i.e., requires no modification to the P4 program for detecting and localizing bugs. We used a P6 prototype to detect and patch seven existing bugs in eight publicly available P4 application programs deployed on two different switch platforms: behavioral model (bmv2) and Tofino. Our evaluation shows that P6 significantly outperforms bug detection baselines while generating fewer packets and patches bugs in P4 programs such as switch.p4 without triggering any regressions. △ Less

Submitted 26 April, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

Comments: added a missing author name to a reference

arXiv:1908.02261 [pdf, other]

Who's Tracking Sensitive Domains?

Authors: Costas Iordanou, Georgios Smaragdakis, Nikolaos Laoutaris

Abstract: We turn our attention to the elephant in the room of data protection, which is none other than the simple and obvious question: "Who's tracking sensitive domains?". Despite a fast-growing amount of work on more complex facets of the interplay between privacy and the business models of the Web, the obvious question of who collects data on domains where most people would prefer not be seen, has rece… ▽ More We turn our attention to the elephant in the room of data protection, which is none other than the simple and obvious question: "Who's tracking sensitive domains?". Despite a fast-growing amount of work on more complex facets of the interplay between privacy and the business models of the Web, the obvious question of who collects data on domains where most people would prefer not be seen, has received rather limited attention. First, we develop a methodology for automatically annotating websites that belong to a sensitive category, e.g. as defined by the General Data Protection Regulation (GDPR). Then, we extract the third party tracking services included directly, or via recursive inclusions, by the above mentioned sites. Having analyzed around 30k sensitive domains, we show that such domains are tracked, albeit less intensely than the mainstream ones. Looking in detail at the tracking services operating on them, we find well known names, as well as some less known ones, including some specializing on specific sensitive categories. △ Less

Submitted 6 August, 2019; originally announced August 2019.

arXiv:1902.02165 [pdf, ps, other]

The Dagstuhl Beginners Guide to Reproducibility for Experimental Networking Research

Authors: Vaibhav Bajpai, Anna Brunstrom, Anja Feldmann, Wolfgang Kellerer, Aiko Pras, Henning Schulzrinne, Georgios Smaragdakis, Matthias Wählisch, Klaus Wehrle

Abstract: Reproducibility is one of the key characteristics of good science, but hard to achieve for experimental disciplines like Internet measurements and networked systems. This guide provides advice to researchers, particularly those new to the field, on designing experiments so that their work is more likely to be reproducible and to serve as a foundation for follow-on work by others. Reproducibility is one of the key characteristics of good science, but hard to achieve for experimental disciplines like Internet measurements and networked systems. This guide provides advice to researchers, particularly those new to the field, on designing experiments so that their work is more likely to be reproducible and to serve as a foundation for follow-on work by others. △ Less

Submitted 12 January, 2019; originally announced February 2019.

Journal ref: SIGCOMM Computer Communication Review (2019)

arXiv:1606.00360 [pdf, other]

doi 10.1145/2987443.2987473

Beyond Counting: New Perspectives on the Active IPv4 Address Space

Authors: Philipp Richter, Georgios Smaragdakis, David Plonka, Arthur Berger

Abstract: In this study, we report on techniques and analyses that enable us to capture Internet-wide activity at individual IP address-level granularity by relying on server logs of a large commercial content delivery network (CDN) that serves close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015, these logs recorded client activity involving 1.2 billion unique IPv4 addresses, the hi… ▽ More In this study, we report on techniques and analyses that enable us to capture Internet-wide activity at individual IP address-level granularity by relying on server logs of a large commercial content delivery network (CDN) that serves close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015, these logs recorded client activity involving 1.2 billion unique IPv4 addresses, the highest ever measured, in agreement with recent estimates. Monthly client IPv4 address counts showed constant growth for years prior, but since 2014, the IPv4 count has stagnated while IPv6 counts have grown. Thus, it seems we have entered an era marked by increased complexity, one in which the sole enumeration of active IPv4 addresses is of little use to characterize recent growth of the Internet as a whole. With this observation in mind, we consider new points of view in the study of global IPv4 address activity. Our analysis shows significant churn in active IPv4 addresses: the set of active IPv4 addresses varies by as much as 25% over the course of a year. Second, by looking across the active addresses in a prefix, we are able to identify and attribute activity patterns to network restructurings, user behaviors, and, in particular, various address assignment practices. Third, by combining spatio-temporal measures of address utilization with measures of traffic volume, and sampling-based estimates of relative host counts, we present novel perspectives on worldwide IPv4 address activity, including empirical observation of under-utilization in some areas, and complete utilization, or exhaustion, in others. △ Less

Submitted 9 September, 2016; v1 submitted 1 June, 2016; originally announced June 2016.

Comments: in Proceedings of ACM IMC 2016

arXiv:1405.1811 [pdf, other]

Opportunities in a Federated Cloud Marketplace

Authors: Hamed Haddadi, Georgios Smaragdakis, K. K. Ramakrishnan

Abstract: Recent measurement studies show that there are massively distributed hosting and computing infrastructures deployed in the Internet. Such infrastructures include large data centers and organizations' computing clusters. When idle, these resources can readily serve local users. Such users can be smartphone or tablet users wishing to access services such as remote desktop or CPU/bandwidth intensive… ▽ More Recent measurement studies show that there are massively distributed hosting and computing infrastructures deployed in the Internet. Such infrastructures include large data centers and organizations' computing clusters. When idle, these resources can readily serve local users. Such users can be smartphone or tablet users wishing to access services such as remote desktop or CPU/bandwidth intensive activities. Particularly, when they are likely to have high latency to access, or may have no access at all to, centralized cloud providers. Today, however, there is no global marketplace where sellers and buyers of available resources can trade. The recently introduced marketplaces of Amazon and other cloud infrastructures are limited by the network footprint of their own infrastructures and availability of such services in the target country and region. In this article we discuss the potentials for a federated cloud marketplace where sellers and buyers of a number of resources, including storage, computing, and network bandwidth, can freely trade. This ecosystem can be regulated through brokers who act as service level monitors and auctioneers. We conclude by discussing the challenges and opportunities in this space. △ Less

Submitted 8 May, 2014; originally announced May 2014.

Comments: 4 pages

ACM Class: C.2.4

arXiv:1307.5264 [pdf]

On the importance of Internet eXchange Points for today's Internet ecosystem

Authors: Nikolaos Chatzis, Georgios Smaragdakis, Anja Feldmann

Abstract: Internet eXchange Points (IXPs) are generally considered to be the successors of the four Network Access Points that were mandated as part of the decommissioning of the NSFNET in 1994/95 to facilitate the transition from the NSFNET to the "public Internet" as we know it today. While this popular view does not tell the whole story behind the early beginnings of IXPs, what is true is that since arou… ▽ More Internet eXchange Points (IXPs) are generally considered to be the successors of the four Network Access Points that were mandated as part of the decommissioning of the NSFNET in 1994/95 to facilitate the transition from the NSFNET to the "public Internet" as we know it today. While this popular view does not tell the whole story behind the early beginnings of IXPs, what is true is that since around 1994, the number of operational IXPs worldwide has grown to more than 300 (as of May 2013), with the largest IXPs handling daily traffic volumes comparable to those carried by the largest Tier-1 ISPs, but IXPs have never really attracted any attention from the networking research community. At first glance, this lack of interest seems understandable as IXPs have apparently little to do with current "hot" topic areas such as data centers and cloud services or software defined networking (SDN) and mobile communication. However, we argue in this article that, in fact, IXPs are all about data centers and cloud services and even SDN and mobile communication and should be of great interest to networking researchers interested in understanding the current and future Internet ecosystem. To this end, we survey the existing but largely unknown sources of publicly available information about IXPs to describe their basic technical and operational aspects and highlight the critical differences among the various IXPs in the different regions of the world, especially in Europe and North America. More importantly, we illustrate the important role that IXPs play in today's Internet ecosystem and discuss how IXP-driven innovation in Europe is sha** and redefining the Internet marketplace, not only in Europe but increasingly so around the world. △ Less

Submitted 27 July, 2013; v1 submitted 19 July, 2013; originally announced July 2013.

Comments: 10 pages, keywords: Internet Exchange Point, Internet Architecture, Peering, Content Delivery

arXiv:1202.1464 [pdf, ps, other]

Content-aware Traffic Engineering

Authors: Benjamin Frank, Ingmar Poese, Georgios Smaragdakis, Steve Uhlig, Anja Feldmann

Abstract: Today, a large fraction of Internet traffic is originated by Content Providers (CPs) such as content distribution networks and hyper-giants. To cope with the increasing demand for content, CPs deploy massively distributed infrastructures. This poses new challenges for CPs as they have to dynamically map end-users to appropriate servers, without being fully aware of network conditions within an ISP… ▽ More Today, a large fraction of Internet traffic is originated by Content Providers (CPs) such as content distribution networks and hyper-giants. To cope with the increasing demand for content, CPs deploy massively distributed infrastructures. This poses new challenges for CPs as they have to dynamically map end-users to appropriate servers, without being fully aware of network conditions within an ISP as well as the end-users network locations. Furthermore, ISPs struggle to cope with rapid traffic shifts caused by the dynamic server selection process of CPs. In this paper, we argue that the challenges that CPs and ISPs face separately today can be turned into an opportunity. We show how they can jointly take advantage of the deployed distributed infrastructures to improve their operation and end-user performance. We propose Content-aware Traffic Engineering (CaTE), which dynamically adapts the traffic demand for content hosted on CPs by utilizing ISP network information and end-user location during the server selection process. As a result, CPs enhance their end-user to server map** and improve end-user experience, thanks to the ability of network-informed server selection to circumvent network bottlenecks. In addition, ISPs gain the ability to partially influence the traffic demands in their networks. Our results with operational data show improvements in path length and delay between end-user and the assigned CP server, network wide traffic reduction of up to 15%, and a decrease in ISP link utilization of up to 40% when applying CaTE to traffic delivered by a small number of major CPs. △ Less

Submitted 7 February, 2012; originally announced February 2012.

Comments: Also appears as TU-Berlin technical report 2012-3, ISSN: 1436-9915

Showing 1–20 of 20 results for author: Smaragdakis, G