-
Developing a Series of AI Challenges for the United States Department of the Air Force
Authors:
Vijay Gadepally,
Gregory Angelides,
Andrei Barbu,
Andrew Bowne,
Laura J. Brattain,
Tamara Broderick,
Armando Cabrera,
Glenn Carl,
Ronisha Carter,
Miriam Cha,
Emilie Cowen,
Jesse Cummings,
Bill Freeman,
James Glass,
Sam Goldberg,
Mark Hamilton,
Thomas Heldt,
Kuan Wei Huang,
Phillip Isola,
Boris Katz,
Jamie Koerner,
Yen-Chen Lin,
David Mayo,
Kyle McAlpin,
Taylor Perron,
et al. (17 additional authors not shown)
Abstract:
Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requirements. Several projects supported by the DAF-MIT AI Accelerator are developing public challenge problems that address numerous Federal AI research priorities. These challenges target priorities by making large, AI-ready datasets publicly available, incentivizing open-source solutions, and creating a demand signal for dual use technologies that can stimulate further research. In this article, we describe these public challenges being developed and how their application contributes to scientific advances.
Submitted 14 July, 2022;
originally announced July 2022.
-
MultiEarth 2022 -- Multimodal Learning for Earth and Environment Workshop and Challenge
Authors:
Miriam Cha,
Kuan Wei Huang,
Morgan Schmidt,
Gregory Angelides,
Mark Hamilton,
Sam Goldberg,
Armando Cabrera,
Phillip Isola,
Taylor Perron,
Bill Freeman,
Yen-Chen Lin,
Brandon Swenson,
Jean Piou
Abstract:
The Multimodal Learning for Earth and Environment Challenge (MultiEarth 2022) will be the first competition aimed at the monitoring and analysis of deforestation in the Amazon rainforest at any time and in any weather conditions. The goal of the Challenge is to provide a common benchmark for multimodal information processing and to bring together the earth and environmental science communities as well as multimodal representation learning communities to compare the relative merits of the various multimodal learning methods to deforestation estimation under well-defined and strictly comparable conditions. MultiEarth 2022 will have three sub-challenges: 1) matrix completion, 2) deforestation estimation, and 3) image-to-image translation. This paper presents the challenge guidelines, datasets, and evaluation metrics for the three sub-challenges. Our challenge website is available at https://sites.google.com/view/rainforest-challenge.
Submitted 31 May, 2022; v1 submitted 15 April, 2022;
originally announced April 2022.
-
Dr. Watson type Artificial Intellect (AI) Systems
Authors:
Saveli Goldberg,
Stanislav Belyaev,
Vladimir Sluchak
Abstract:
The article proposes a new type of AI system that does not give solutions directly but rather points toward them, prompting the user in a friendly manner with questions and adjusting messages. Models of AI-human collaboration can be deduced from the classic literary example of interaction between Mr. Holmes and Dr. Watson from the stories by Conan Doyle, where the highly qualified expert Mr. Holmes answers questions posed by Dr. Watson. Here Mr. Holmes, with his rule-based calculations, logic, and memory management, apparently plays the role of an AI system, and Dr. Watson is the user. Looking into the same Holmes-Watson interaction, we find and promote another model in which the AI behaves like Dr. Watson, who, by asking questions and acting in a particular way, helps Holmes (the AI user) make the right decisions. We call the systems based on this principle "Dr. Watson-type systems." The article describes the properties of such systems and introduces two particular examples: a Patient Management System for intensive care physicians and a Data Error Prevention System.
Submitted 22 June, 2021;
originally announced June 2021.
-
Passport: Enabling Accurate Country-Level Router Geolocation using Inaccurate Sources
Authors:
Muzammil Abdul Rehman,
Sharon Goldberg,
David Choffnes
Abstract:
When does Internet traffic cross international borders? This question has major geopolitical, legal and social implications and is surprisingly difficult to answer. A critical stumbling block is a dearth of tools that accurately map routers traversed by Internet traffic to the countries in which they are located. This paper presents Passport: a new approach for efficient, accurate country-level router geolocation and a system that implements it. Passport provides location predictions with limited active measurements, using machine learning to combine information from IP geolocation databases, router hostnames, whois records, and ping measurements. We show that Passport substantially outperforms existing techniques, and identify cases where paths traverse countries with implications for security, privacy, and performance.
Submitted 23 July, 2019; v1 submitted 12 May, 2019;
originally announced May 2019.
-
Modelling Citation Networks
Authors:
S. R. Goldberg,
H. Anthony,
T. S. Evans
Abstract:
The distribution of the number of academic publications as a function of citation count for a given year is remarkably similar from year to year. We measure this similarity as a width of the distribution and find it to be approximately constant from year to year. We show that simple citation models fail to capture this behaviour. We then provide a simple three-parameter citation network model using a mixture of local and global search processes which can reproduce the correct distribution over time. We use the citation network of papers from the hep-th section of arXiv to test our model. For this data, around 20% of citations use global information to reference recently published papers, while the remaining 80% are found using local searches. We note that this is consistent with other studies, though our motivation is very different from previous work. Finally, we also find that the fluctuations in the size of an academic publication's bibliography are important for the model. This is not addressed in most models and needs further work.
Submitted 13 August, 2014;
originally announced August 2014.
-
BGP Security in Partial Deployment: Is the Juice Worth the Squeeze?
Authors:
Robert Lychev,
Sharon Goldberg,
Michael Schapira
Abstract:
As the rollout of secure route origin authentication with the RPKI slowly gains traction among network operators, there is a push to standardize secure path validation for BGP (i.e., S*BGP: S-BGP, soBGP, BGPSEC, etc.). Origin authentication already does much to improve routing security. Moreover, the transition to S*BGP is expected to be long and slow, with S*BGP coexisting in "partial deployment" alongside BGP for a long time. We therefore use theoretical and experimental approaches to study the security benefits provided by partially-deployed S*BGP, vis-a-vis those already provided by origin authentication. Because routing policies have a profound impact on routing security, we use a survey of 100 network operators to find the policies that are likely to be most popular during partial S*BGP deployment. We find that S*BGP provides only meagre benefits over origin authentication when these popular policies are used. We also study the security benefits of other routing policies, provide prescriptive guidelines for partially-deployed S*BGP, and show how interactions between S*BGP and BGP can introduce new vulnerabilities into the routing system.
Submitted 10 July, 2013;
originally announced July 2013.
-
Calibrating Data to Sensitivity in Private Data Analysis
Authors:
Davide Proserpio,
Sharon Goldberg,
Frank McSherry
Abstract:
We present an approach to differentially private computation in which one does not scale up the magnitude of noise for challenging queries, but rather scales down the contributions of challenging records. While scaling down all records uniformly is equivalent to scaling up the noise magnitude, we show that scaling records non-uniformly can result in substantially higher accuracy by bypassing the worst-case requirements of differential privacy for the noise magnitudes. This paper details the data analysis platform wPINQ, which generalizes the Privacy Integrated Query (PINQ) to weighted datasets. Using a few simple operators (including a non-uniformly scaling Join operator) wPINQ can reproduce (and improve) several recent results on graph analysis and introduce new generalizations (e.g., counting triangles with given degrees). We also show how to integrate probabilistic inference techniques to synthesize datasets respecting more complicated (and less easily interpreted) measurements.
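The equivalence stated in this abstract (scaling all records down uniformly is the same as scaling the noise up) can be illustrated with a small sketch. This is not the wPINQ API: the function names, the use of a weighted sum as the query, and the Laplace noise of scale 1/epsilon are all assumptions made for illustration only.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with rate 1/scale
    # follows a Laplace distribution with the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_weighted_count(weights, epsilon, rng):
    # Hypothetical query: sum of record weights plus Laplace noise of
    # scale 1/epsilon (an assumed mechanism, not taken from the paper).
    return sum(weights) + laplace_noise(1.0 / epsilon, rng)

def uniformly_scaled_count(weights, c, epsilon, rng):
    # Scale every record's weight by the same factor c, answer the noisy
    # query, then divide by c to return to the original units.
    scaled = [c * w for w in weights]
    return noisy_weighted_count(scaled, epsilon, rng) / c
```

With the same random draws, `uniformly_scaled_count(weights, c, epsilon, rng)` matches the unscaled sum plus Laplace noise of scale 1/(c * epsilon), so halving every weight doubles the effective noise. The accuracy gains described in the abstract come from choosing different scale factors per record, which this toy example does not attempt.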
Submitted 4 May, 2014; v1 submitted 15 March, 2012;
originally announced March 2012.
-
Network-Destabilizing Attacks
Authors:
Robert Lychev,
Sharon Goldberg,
Michael Schapira
Abstract:
The Border Gateway Protocol (BGP) sets up routes between the smaller networks that make up the Internet. Despite its crucial role, BGP is notoriously vulnerable to serious problems, including (1) propagation of bogus routing information due to attacks or misconfigurations, and (2) network instabilities in the form of persistent routing oscillations. The conditions required to avoid BGP instabilities are quite delicate. How, then, can we explain the observed stability of today's Internet in the face of common configuration errors and attacks? This work explains this phenomenon by first noticing that almost every observed attack and misconfiguration to date shares a common characteristic: even when a router announces egregiously bogus information, it will continue to announce the same bogus information for the duration of its attack/misconfiguration. We call these the "fixed-route attacks", and show that, while even simple fixed-route attacks can destabilize a network, the commercial routing policies used in today's Internet prevent such attacks from creating instabilities.
Submitted 30 August, 2012; v1 submitted 7 March, 2012;
originally announced March 2012.
-
The Diffusion of Networking Technologies
Authors:
Sharon Goldberg,
Zhenming Liu
Abstract:
There has been significant interest in the networking community in the impact of cascade effects on the diffusion of networking technology upgrades in the Internet. Thinking of the global Internet as a graph, where each node represents an economically-motivated Internet Service Provider (ISP), a key problem is to determine the smallest set of nodes that can trigger a cascade that causes every other node in the graph to adopt the protocol. We design the first approximation algorithm with a provable performance guarantee for this problem, in a model that captures the following key issue: a node's decision to upgrade should be influenced by the decisions of the remote nodes it wishes to communicate with.
Given an internetwork $G(V,E)$ and threshold function $\theta$, we assume that node $u$ activates (upgrades to the new technology) when it is adjacent to a connected component of active nodes in $G$ of size exceeding node $u$'s threshold $\theta(u)$. Our objective is to choose the smallest set of nodes that can cause the rest of the graph to activate. Our main contribution is an approximation algorithm based on linear programming, which we complement with computational hardness results and a near-optimum integrality gap. Our algorithm, which does not rely on submodular optimization techniques, also highlights the substantial algorithmic difference between our problem and similar questions studied in the context of social networks.
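The activation rule described above can be sketched as a small simulation. This only simulates the cascade from a given seed set; it is not the paper's LP-based approximation algorithm for choosing seeds. The adjacency-dict representation, the function names, and the strict "size > θ(u)" reading of "exceeding" are assumptions made for illustration.

```python
from collections import deque

def active_component_sizes(adj, active):
    # For each active node, the size of its connected component in the
    # subgraph induced by the active nodes (found by breadth-first search).
    size_of = {}
    for start in active:
        if start in size_of:
            continue
        component = [start]
        seen = {start}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w in active and w not in seen:
                    seen.add(w)
                    component.append(w)
                    queue.append(w)
        for v in component:
            size_of[v] = len(component)
    return size_of

def cascade(adj, theta, seeds):
    # Repeatedly activate any inactive node u that is adjacent to an active
    # component whose size exceeds theta[u], until no node can activate.
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        sizes = active_component_sizes(adj, active)
        for u in adj:
            if u in active:
                continue
            if any(w in active and sizes[w] > theta[u] for w in adj[u]):
                active.add(u)
                changed = True
                break  # component sizes changed, so recompute them
    return active

# Hypothetical example: a 4-node path 0-1-2-3 with increasing thresholds.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
theta = {0: 0, 1: 0, 2: 1, 3: 2}
```

On this path, seeding node 0 activates the whole graph, while seeding node 3 activates nothing else, because node 2 needs an adjacent active component of size greater than 1. Activation is monotone (active components only grow), so the order in which eligible nodes are activated does not change the final set.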
Submitted 26 November, 2012; v1 submitted 13 February, 2012;
originally announced February 2012.
-
A minimal nonfinitely based semigroup whose variety is polynomially recognizable
Authors:
Mikhail V. Volkov,
Svetlana V. Goldberg,
Stanislav I. Kublanovsky
Abstract:
We exhibit a 6-element semigroup that has no finite identity basis but nevertheless generates a variety whose finite membership problem admits a polynomial algorithm.
Submitted 14 August, 2010;
originally announced August 2010.