-
Watching the Watchers: Nonce-based Inverse Surveillance to Remotely Detect Monitoring
Authors:
Laura M. Roberts,
David Plonka
Abstract:
Internet users and service providers do not often know when traffic is being watched but desire a way to determine when, where, and by whom. We present NOISE, the Nonce Observatory for Inverse Surveillance of Eavesdroppers, a method and system that detects monitoring by disseminating nonces - unique, pseudorandom values - in traffic and seeing if they are acted upon unexpectedly, indicating that the nonce-laden traffic is being monitored. Specifically, we embed 64-bit nonces innocuously into IPv6 addresses and disseminate these nonces Internet-wide using a modified traceroute-like tool that makes each outbound probe's source address unique. We continually monitor for subsequent nonce propagation, i.e., activity or interest involving these nonces, e.g., via packet capture on our system's infrastructure. Across three experiments and four months, NOISE detects monitoring more than 200k times, ostensibly in 268 networks, for probes destined for 437 networks. Our results reveal: (a) data collection for security incident handling, (b) traffic information being shared with third parties, and (c) eavesdropping in or near a large commercial peering exchange.
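The nonce embedding described above can be sketched roughly as follows. This is a minimal illustration, not the NOISE implementation: it assumes the nonce occupies the low 64 bits (the interface identifier) of an address in a /64 prefix, which the abstract does not state, and it uses the documentation prefix 2001:db8::/32. The function names `embed_nonce`, `extract_nonce`, and `new_nonce` are hypothetical.

```python
import ipaddress
import secrets

NONCE_MASK = 2**64 - 1  # low 64 bits of a 128-bit IPv6 address


def new_nonce() -> int:
    """Return a fresh pseudorandom 64-bit value (assumed nonce source)."""
    return secrets.randbits(64)


def embed_nonce(prefix: str, nonce: int) -> ipaddress.IPv6Address:
    """Place a 64-bit nonce in the low 64 bits of an address in a /64 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch assumes a /64 prefix")
    if not 0 <= nonce <= NONCE_MASK:
        raise ValueError("nonce must fit in 64 bits")
    return ipaddress.IPv6Address(int(net.network_address) | nonce)


def extract_nonce(address: str) -> int:
    """Recover the low 64 bits of an address, e.g. one seen in a packet capture."""
    return int(ipaddress.IPv6Address(address)) & NONCE_MASK


if __name__ == "__main__":
    source = embed_nonce("2001:db8:0:1::/64", new_nonce())
    print(source, hex(extract_nonce(str(source))))
```

Under these assumptions, each probe gets a distinct source address, so any later traffic that mentions the address can be matched back to the probe that carried it.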
Submitted 5 June, 2020; v1 submitted 15 May, 2020;
originally announced May 2020.
-
In the IP of the Beholder: Strategies for Active IPv6 Topology Discovery
Authors:
Robert Beverly,
Ramakrishnan Durairajan,
David Plonka,
Justin P. Rohrer
Abstract:
Existing methods for active topology discovery within the IPv6 Internet largely mirror those of IPv4. In light of the large and sparsely populated address space, in conjunction with aggressive ICMPv6 rate limiting by routers, this work develops a different approach to Internet-wide IPv6 topology mapping. First, we adopt randomized probing techniques in order to distribute probing load, minimize the effects of rate limiting, and probe at higher rates. Second, we extensively analyze the efficiency and efficacy of various IPv6 hitlists and target generation methods when used for topology discovery, and synthesize new target lists based on our empirical results to provide both breadth (coverage across networks) and depth (to find potential subnetting). Employing our probing strategy, we discover more than 1.3M IPv6 router interface addresses from a single vantage point. Finally, we share our prober implementation, synthesized target lists, and discovered IPv6 topology results.
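The idea of randomizing probe order to spread load can be illustrated with a short sketch. It assumes that a probe is a (target, hop limit) pair and that a plain shuffle is an acceptable stand-in for whatever permutation the authors' prober uses; the abstract specifies neither. The function name `randomized_probe_order` is hypothetical.

```python
import random
from itertools import product


def randomized_probe_order(targets, max_hop_limit, seed=None):
    """Return every (target, hop_limit) pair exactly once, in shuffled order.

    Shuffling avoids sending consecutive probes toward the same path,
    which is one way to reduce the chance of tripping a single router's
    ICMPv6 rate limit.
    """
    probes = list(product(targets, range(1, max_hop_limit + 1)))
    random.Random(seed).shuffle(probes)
    return probes


if __name__ == "__main__":
    for target, hop_limit in randomized_probe_order(
        ["2001:db8::1", "2001:db8:1::1"], max_hop_limit=3, seed=0
    ):
        print(target, hop_limit)
```

A sequential traceroute would instead send hop limits 1, 2, 3 to one target before moving on, concentrating probes on the routers nearest the vantage point.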
Submitted 9 October, 2018; v1 submitted 29 May, 2018;
originally announced May 2018.
-
kIP: a Measured Approach to IPv6 Address Anonymization
Authors:
David Plonka,
Arthur Berger
Abstract:
Privacy-minded Internet service operators anonymize IPv6 addresses by truncating them to a fixed length, perhaps due to long-standing use of this technique with IPv4 and a belief that it's "good enough." We claim that simple anonymization by truncation is suspect since it does not entail privacy guarantees nor does it take into account some common address assignment practices observed today. To investigate, with standard activity logs as input, we develop a counting method to determine a lower bound on the number of active IPv6 addresses that are simultaneously assigned, such as those of clients that access World-Wide Web services. In many instances, we find that these empirical measurements offer no evidence that truncating IPv6 addresses to a fixed number of bits, e.g., 48 in common practice, protects individuals' privacy.
To remedy this problem, we propose kIP anonymization, an aggregation method that ensures a certain level of address privacy. Our method adaptively determines variable truncation lengths using parameter k, the desired number of active (rather than merely potential) addresses, e.g., 32 or 256, that cannot be distinguished from each other once anonymized. We describe our implementation and present first results of its application to millions of real IPv6 client addresses active over a week's time. These results demonstrate both feasibility at large scale and the ability to automatically adapt to each network's address assignment practice and synthesize a set of anonymous aggregates (prefixes), each of which is guaranteed to cover (contain) at least k of the active addresses. Each address is anonymized by truncating it to the length of its longest matching prefix in that set.
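The final anonymization step, truncating an address to its longest matching prefix, can be sketched as below. This covers only that last step and a check of the "at least k" property; how kIP synthesizes the aggregate set is not described in the abstract and is not attempted here. The function names and the example aggregates are hypothetical.

```python
import ipaddress


def anonymize(address, aggregates):
    """Return the longest aggregate prefix containing the address, or None.

    Truncating an address to the length of its longest matching prefix
    yields that prefix itself, so the prefix is the anonymized form.
    """
    addr = ipaddress.IPv6Address(address)
    matches = [net for net in aggregates if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda net: net.prefixlen)


def covers_at_least_k(active_addresses, aggregates, k):
    """Check that every aggregate contains at least k distinct active addresses."""
    active = {ipaddress.IPv6Address(a) for a in active_addresses}
    return all(sum(1 for a in active if a in net) >= k for net in aggregates)


if __name__ == "__main__":
    aggregates = [
        ipaddress.IPv6Network("2001:db8::/32"),
        ipaddress.IPv6Network("2001:db8:1::/48"),
    ]
    print(anonymize("2001:db8:1:2::abcd", aggregates))
```

The linear scan over aggregates keeps the sketch short; a trie would be the usual choice for millions of addresses.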
Submitted 12 July, 2017;
originally announced July 2017.
-
Entropy/IP: Uncovering Structure in IPv6 Addresses
Authors:
Pawel Foremski,
David Plonka,
Arthur Berger
Abstract:
In this paper, we introduce Entropy/IP: a system that discovers Internet address structure based on analyses of a subset of IPv6 addresses known to be active, i.e., training data, gleaned by readily available passive and active means. The system is completely automated and employs a combination of information-theoretic and machine learning techniques to probabilistically model IPv6 addresses. We present results showing that our system is effective in exposing structural characteristics of portions of the IPv6 Internet address space populated by active client, service, and router addresses.
In addition to visualizing the address structure for exploration, the system uses its models to generate candidate target addresses for scanning. For each of 15 evaluated datasets, we train on 1K addresses and generate 1M candidates for scanning. We achieve some success in 14 datasets, finding up to 40% of the generated addresses to be active. In 11 of these datasets, we find active network identifiers (e.g., /64 prefixes or 'subnets') not seen in training. Thus, we provide the first evidence that it is practical to discover subnets and hosts by scanning probabilistically selected areas of the IPv6 address space not known to contain active hosts a priori.
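As one illustration of an information-theoretic view of address structure, the sketch below computes the Shannon entropy of each hexadecimal digit position across a set of addresses. Treating the address as 32 hex digits and measuring per-position entropy is an assumption made for this example; the abstract does not specify which quantities Entropy/IP computes. The function name `hex_position_entropies` is hypothetical.

```python
import ipaddress
import math
from collections import Counter


def hex_position_entropies(addresses):
    """Return 32 entropy values (in bits), one per hex digit position.

    A value of 0 means the digit is constant across the set; a value
    near 4 means the digit looks uniformly random.
    """
    hex_strings = [
        ipaddress.IPv6Address(a).exploded.replace(":", "") for a in addresses
    ]
    total = len(hex_strings)
    entropies = []
    for position in range(32):
        counts = Counter(h[position] for h in hex_strings)
        entropy = sum(
            -(c / total) * math.log2(c / total) for c in counts.values()
        )
        entropies.append(entropy)
    return entropies


if __name__ == "__main__":
    sample = ["2001:db8::1", "2001:db8::2", "2001:db8::3", "2001:db8::4"]
    print(hex_position_entropies(sample))
```

Positions with low entropy suggest fixed structure such as a shared prefix, while high-entropy positions suggest randomized or densely varied fields.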
Submitted 21 November, 2016; v1 submitted 14 June, 2016;
originally announced June 2016.
-
Beyond Counting: New Perspectives on the Active IPv4 Address Space
Authors:
Philipp Richter,
Georgios Smaragdakis,
David Plonka,
Arthur Berger
Abstract:
In this study, we report on techniques and analyses that enable us to capture Internet-wide activity at individual IP address-level granularity by relying on server logs of a large commercial content delivery network (CDN) that serves close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015, these logs recorded client activity involving 1.2 billion unique IPv4 addresses, the highest ever measured, in agreement with recent estimates. Monthly client IPv4 address counts showed constant growth for years prior, but since 2014, the IPv4 count has stagnated while IPv6 counts have grown. Thus, it seems we have entered an era marked by increased complexity, one in which the sole enumeration of active IPv4 addresses is of little use to characterize recent growth of the Internet as a whole.
With this observation in mind, we consider new points of view in the study of global IPv4 address activity. Our analysis shows significant churn in active IPv4 addresses: the set of active IPv4 addresses varies by as much as 25% over the course of a year. Second, by looking across the active addresses in a prefix, we are able to identify and attribute activity patterns to network restructurings, user behaviors, and, in particular, various address assignment practices. Third, by combining spatio-temporal measures of address utilization with measures of traffic volume, and sampling-based estimates of relative host counts, we present novel perspectives on worldwide IPv4 address activity, including empirical observation of under-utilization in some areas, and complete utilization, or exhaustion, in others.
Submitted 9 September, 2016; v1 submitted 1 June, 2016;
originally announced June 2016.
-
Temporal and Spatial Classification of Active IPv6 Addresses
Authors:
David Plonka,
Arthur Berger
Abstract:
There is a striking volume of World-Wide Web activity on IPv6 today. In early 2015, one large Content Distribution Network handles 50 billion IPv6 requests per day from hundreds of millions of IPv6 client addresses; billions of unique client addresses are observed per month. Address counts, however, obscure the number of hosts with IPv6 connectivity to the global Internet. There are numerous address assignment and subnetting options in use; privacy addresses and dynamic subnet pools significantly inflate the number of active IPv6 addresses. As the IPv6 address space is vast, it is infeasible to comprehensively probe every possible unicast IPv6 address. Thus, to survey the characteristics of IPv6 addressing, we perform a year-long passive measurement study, analyzing the IPv6 addresses gleaned from activity logs for all clients accessing a global CDN.
The goal of our work is to develop flexible classification and measurement methods for IPv6, motivated by the fact that its addresses are not merely more numerous; they are different in kind. We introduce the notion of classifying addresses and prefixes in two ways: (1) temporally, according to their instances of activity to discern which addresses can be considered stable; (2) spatially, according to the density or sparsity of aggregates in which active addresses reside. We present measurement and classification results numerically and visually that: provide details on IPv6 address use and structure in global operation across the past year; establish the efficacy of our classification methods; and demonstrate that such classification can clarify dimensions of the Internet that otherwise appear quite blurred by current IPv6 addressing practices.
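The two classification axes can be illustrated with a small sketch. It assumes that "stable" means an address was seen on days at least `min_gap_days` apart, and that spatial density is measured as the number of distinct active addresses per prefix of a chosen length; both definitions are assumptions for illustration, since the abstract does not give the authors' exact criteria. The function names are hypothetical.

```python
import ipaddress
from collections import defaultdict
from datetime import date


def stable_addresses(observations, min_gap_days):
    """Temporal view: addresses whose first and last sightings are far apart.

    observations is an iterable of (address string, date) pairs.
    """
    first_last = {}
    for addr, day in observations:
        first, last = first_last.get(addr, (day, day))
        first_last[addr] = (min(first, day), max(last, day))
    return {
        addr
        for addr, (first, last) in first_last.items()
        if (last - first).days >= min_gap_days
    }


def active_per_prefix(addresses, prefixlen):
    """Spatial view: count distinct active addresses within each prefix."""
    counts = defaultdict(int)
    for addr in set(addresses):
        net = ipaddress.IPv6Network(f"{addr}/{prefixlen}", strict=False)
        counts[net] += 1
    return dict(counts)


if __name__ == "__main__":
    log = [
        ("2001:db8::1", date(2015, 1, 1)),
        ("2001:db8::1", date(2015, 1, 10)),
        ("2001:db8::2", date(2015, 1, 1)),
    ]
    print(stable_addresses(log, min_gap_days=7))
    print(active_per_prefix([a for a, _ in log], 64))
```

An address seen only once, such as a short-lived privacy address, falls outside the stable set, and a prefix with many active addresses reads as dense under this measure.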
Submitted 17 July, 2015; v1 submitted 26 June, 2015;
originally announced June 2015.