-
Challenges for Predictive Modeling with Neural Network Techniques using Error-Prone Dietary Intake Data
Authors:
Dylan Spicker,
Amir Nazemi,
Joy Hutchinson,
Paul Fieguth,
Sharon I. Kirkpatrick,
Michael Wallace,
Kevin W. Dodd
Abstract:
Dietary intake data are routinely drawn upon to explore diet-health relationships. However, these data are often subject to measurement error, distorting the true relationships. Beyond measurement error, there are likely complex synergistic and sometimes antagonistic interactions between different dietary components, complicating the relationships between diet and health outcomes. Flexible models are required to capture the nuance that these complex interactions introduce. This complexity makes research on diet-health relationships an appealing candidate for the application of machine learning techniques, and in particular, neural networks. Neural networks are computational models that are able to capture highly complex, nonlinear relationships so long as sufficient data are available. While these models have been applied in many domains, the impacts of measurement error on the performance of predictive modeling have not been systematically investigated. However, dietary intake data are typically collected using self-report methods and are prone to large amounts of measurement error. In this work, we demonstrate the ways in which measurement error erodes the performance of neural networks, and illustrate the care that is required for leveraging these models in the presence of error. We demonstrate the role that sample size and replicate measurements play in model performance, indicate a motivation for the investigation of transformations to additivity, and illustrate the caution required to prevent model overfitting. While the past performance of neural networks across various domains makes them an attractive candidate for examining diet-health relationships, our work demonstrates that substantial care and further methodological development are both required to observe increased predictive performance when applying these techniques, compared to more traditional statistical procedures.
Submitted 15 November, 2023;
originally announced November 2023.
-
NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches
Authors:
Chi-en Amy Tai,
Matthew Keller,
Saeejith Nair,
Yuhao Chen,
Yifan Wu,
Olivia Markham,
Krish Parmar,
Pengcheng Xi,
Heather Keller,
Sharon Kirkpatrick,
Alexander Wong
Abstract:
Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating, as malnutrition has been directly linked to decreased quality of life. However self-reporting methods such as food diaries suffer from substantial bias. Other conventional dietary assessment techniques and emerging alternative approaches such as mobile applications incur high time costs and may necessitate trained personnel. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images, but the lack of comprehensive datasets with diverse viewpoints, modalities and food annotations hinders the accuracy and realism of such methods. To address this limitation, we introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 photorealistic synthetic 2D food images with associated dietary information and multimodal annotations (including depth images, instance masks, and semantic masks). Additionally, we collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism. Leveraging these novel datasets, we develop and benchmark NutritionVerse, an empirical study of various dietary intake estimation approaches, including indirect segmentation-based and direct prediction networks. We further fine-tune models pretrained on synthetic data with real images to provide insights into the fusion of synthetic and real data. Finally, we release both datasets (NutritionVerse-Synth, NutritionVerse-Real) on https://www.kaggle.com/nutritionverse/datasets as part of an open initiative to accelerate machine learning for dietary sensing.
Submitted 14 September, 2023;
originally announced September 2023.
-
Engineering of Niobium Surfaces Through Accelerated Neutral Atom Beam Technology For Quantum Applications
Authors:
Soumen Kar,
Conan Weiland,
Chenyu Zhou,
Ekta Bhatia,
Brian Martinick,
Jakub Nalaskowski,
John Mucci,
Stephen Olson,
Pui Yee Hung,
Ilyssa Wells,
Hunter Frost,
Corbet S. Johnson,
Thomas Murray,
Vidya Kaushik,
Sean Kirkpatrick,
Kiet Chau,
Michael J. Walsh,
Mingzhao Liu,
Satyavolu S. Papa Rao
Abstract:
A major roadblock to scalable quantum computing is phase decoherence and energy relaxation caused by qubits interacting with defect-related two-level systems (TLS). Native oxides present on the surfaces of superconducting metals used in quantum devices are acknowledged to be a source of TLS that decrease qubit coherence times. Reducing microwave loss by surface engineering (i.e., replacing uncontrolled native oxide of superconducting metals with a thin, stable surface with predictable characteristics) can be a key enabler for pushing performance forward with devices of higher quality factor. In this work, we present a novel approach to replace the native oxide of niobium (typically formed in an uncontrolled fashion when its pristine surface is exposed to air) with an engineered oxide, using a room-temperature process that leverages Accelerated Neutral Atom Beam (ANAB) technology at 300 mm wafer scale. This ANAB beam is composed of a mixture of argon and oxygen, with tunable energy per atom, which is rastered across the wafer surface. The ANAB-engineered Nb-oxide thickness was found to vary from 2 nm to 6 nm depending on ANAB process parameters. Modeling of variable-energy XPS data confirms thickness and compositional control of the Nb surface oxide by the ANAB process. These results correlate well with those from transmission electron microscopy and X-ray reflectometry. Since ANAB is broadly applicable to material surfaces, the present study indicates its promise for modification of the surfaces of superconducting quantum circuits to achieve longer coherence times.
Submitted 27 February, 2023;
originally announced February 2023.
-
Simulated annealing, optimization, searching for ground states
Authors:
Sergio Caracciolo,
Alexander K. Hartmann,
Scott Kirkpatrick,
Martin Weigel
Abstract:
The chapter starts with a historical summary of first attempts to optimize the spin glass Hamiltonian, comparing it to recent results on searching largest cliques in random graphs. Exact algorithms to find ground states in generic spin glass models are then explored in Section 1.2, while Section 1.3 is dedicated to the bidimensional case where polynomial algorithms exist and allow for the study of much larger systems. Finally Section 1.4 presents a summary of results for the assignment problem where the finite size corrections for the ground state can be studied in great detail.
Submitted 2 January, 2023;
originally announced January 2023.
-
Impulse Measurement Methods for Pulsed Laser Ablation Propulsion
Authors:
Ishaan Mishra,
Scott Kirkpatrick
Abstract:
Pulsed laser ablation propulsion has the potential to revolutionize space exploration by eliminating the requirement of a spacecraft to carry its propellant and power source as the high-power laser is situated off-board. More experimentation needs to be done to optimize this propulsion system and understand the mechanisms of thrust generation. There are many methods used to calculate the impulse imparted in pulsed laser ablation experiments. In this paper, key performance parameters are derived for some of the impulse measurement methods used in ablation propulsion experiments. Regimes discussed include the torsional pendulum system, simple pendulum system, and solid and liquid microspheres.
Submitted 16 November, 2022;
originally announced November 2022.
-
Hard Optimization Problems have Soft Edges
Authors:
Raffaele Marino,
Scott Kirkpatrick
Abstract:
Finding a Maximum Clique is a classic property test from graph theory; find any one of the largest complete subgraphs in an Erdös-Rényi G(N, p) random graph. We use Maximum Clique to explore the structure of the problem as a function of N, the graph size, and K, the clique size sought. It displays a complex phase boundary, a staircase of steps at each of which 2log2 N and Kmax, the maximum size of a clique that can be found, increases by 1. Each of its boundaries has a finite width, and these widths allow local algorithms to find cliques beyond the limits defined by the study of infinite systems. We explore the performance of a number of extensions of traditional fast local algorithms, and find that much of the "hard" space remains accessible at finite N. The "hidden clique" problem embeds a clique somewhat larger than those which occur naturally in a G(N, p) random graph. Since such a clique is unique, we find that local searches which stop early, once evidence for the hidden clique is found, may outperform the best message passing or spectral algorithms.
Submitted 25 May, 2023; v1 submitted 11 September, 2022;
originally announced September 2022.
-
Comparing Unit Trains versus Manifest Trains for the Risk of Rail Transport of Hazardous Materials -- Part II: Application and Case Study
Authors:
Di Kang,
Jiaxi Zhao,
C. Tyler Dick,
Xiang Liu,
Zheyong Bian,
Steven W. Kirkpatrick,
Chen-Yu Lin
Abstract:
Built upon the risk analysis methodology (presented in the part I paper), this part II paper focuses on applying this methodology. Five illustrative scenarios were used to analyze the best or worst cases and compare the transportation risk differences between service options using unit trains and manifest trains. The comparison results indicate that placing all tank cars at the positions with the lowest probability of derailing and switching tank cars alone in classification yards could provide the lowest risk estimate given the same transportation demand (i.e., number of tank cars to transport). This paper also shows that, based on the data and parameters in the case study, risks during arrival/departure events and yard switching events could be as significant as risks on mainlines. This paper provides a way to use the risk analysis methodology for rail safety decisions. The methodology and its application can be tailored to specific infrastructure and rolling stock characteristics.
Submitted 4 July, 2022;
originally announced August 2022.
-
Comparing Unit Trains versus Manifest Trains for the Risk of Rail Transport of Hazardous Materials -- Part I: Risk Analysis Methodology
Authors:
Di Kang,
Jiaxi Zhao,
C. Tyler Dick,
Xiang Liu,
Zheyong Bian,
Steven W. Kirkpatrick,
Chen-Yu Lin
Abstract:
Transporting hazardous materials (hazmats) using tank cars has more significant economic benefits than other transportation modes. Although railway transportation is roughly four times more fuel-efficient than roadway transportation, a train derailment has greater potential to cause more disastrous consequences than a truck incident. Train types, such as unit train or manifest train (also called mixed train), can influence transport risks in several ways. For example, unit trains only experience risks on mainlines and when arriving at or departing from terminals, while manifest trains experience additional switching risks in yards. Based on prior studies and various data sources covering the years 1996-2018, this paper constructs event chains for line-haul risks on mainlines (for both unit trains and manifest trains), arrival/departure risks in terminals (for unit trains) and yards (for manifest trains), and yard switching risks for manifest trains using various probabilistic models, and finally determines expected casualties as the consequences of a potential train derailment and release incident. This is the first analysis to quantify the total risks a train may encounter throughout the shipment process, either on mainlines or in yards/terminals, distinguishing train types. It provides a methodology applicable to any train to calculate the expected risks (quantified as expected casualties in this paper) from an origin to a destination.
Submitted 4 July, 2022;
originally announced July 2022.
-
Large independent sets on random $d$-regular graphs with fixed degree $d$
Authors:
Raffaele Marino,
Scott Kirkpatrick
Abstract:
This paper presents a linear prioritized local algorithm that computes large independent sets on a random $d$-regular graph with small and fixed degree $d$. We studied experimentally the independence ratio obtained by the algorithm when $ d \in [3,100]$. For all $d \in [5,100]$, our results are larger than lower bounds calculated by exact methods, thus providing improved estimates of lower bounds.
Submitted 17 August, 2021; v1 submitted 27 March, 2020;
originally announced March 2020.
-
From Megabits to CPU Ticks: Enriching a Demand Trace in the Age of MEC
Authors:
Francesco Malandrino,
Carla Fabiana Chiasserini,
Giuseppe Avino,
Marco Malinverno,
Scott Kirkpatrick
Abstract:
All the content consumed by mobile users, be it a web page or a live stream, undergoes some processing along the way; as an example, web pages and videos are transcoded to fit each device's screen. The recent multi-access edge computing (MEC) paradigm envisions performing such processing within the cellular network, as opposed to resorting to a cloud server on the Internet. Designing a MEC network, i.e., placing and dimensioning the computational facilities therein, requires information on how much computational power is required to produce the contents needed by the users. However, real-world demand traces only contain information on how much data is downloaded. In this paper, we demonstrate how to {\em enrich} demand traces with information about the computational power needed to process the different types of content, and we show the substantial benefit that can be obtained from using such enriched traces for the design of MEC-based networks.
Submitted 23 September, 2018;
originally announced September 2018.
-
Revisiting the Challenges of MaxClique
Authors:
Raffaele Marino,
Scott Kirkpatrick
Abstract:
The MaxClique problem, finding the largest complete subgraph in an Erdös-Rényi $G(N,p)$ random graph in the large $N$ limit, is a well-known example of a simple problem for which finding any approximate solution within a factor of $2$ of the known, probabilistically determined limit, appears to require P$=$NP. This type of search has practical importance in very large graphs. Algorithmic approaches run into phase boundaries long before they reach the size of the largest likely solutions. And, most intriguing, there is an extensive literature of \textit{challenges} posed for concrete methods of finding maximum naturally occurring as well as artificially hidden cliques, with computational costs that are at most polynomial in the size of the problem.
We use the probabilistic approach in a novel way to provide a more insightful test of constructive algorithms for this problem. We show that extensions of existing methods of greedy local search will be able to meet the \textit{challenges} for practical problems of size $N$ as large as $10^{10}$ and perhaps more. Experiments with spectral methods that treat a single large clique of size $αN^{1/2}$ \textit{planted} in the graph as an impurity level in a tight binding energy band show that such a clique can be detected when $α\geq \approx1.0$. Belief propagation using a recent \textit{approximate message passing} (\textbf{AMP}) scheme of inference pushes this limit down to $α\sim \sqrt{1/e}$. Exhaustive local search (with early stopping when the planted clique is found) does even better on problems of practical size, and proves to be the fastest solution method for this problem.
Submitted 8 May, 2019; v1 submitted 24 July, 2018;
originally announced July 2018.
-
Mining the Air -- for Research in Social Science and Networking Measurement
Authors:
Scott Kirkpatrick,
Ron Bekkerman,
Adi Zmirli,
Francesco Malandrino
Abstract:
Smartphone apps provide a vitally important opportunity for monitoring human mobility, human experience of ubiquitous information aids, and human activity in our increasingly well-instrumented spaces. As wireless data capabilities move steadily up in performance, from 2G and 3G to 4G (today's LTE) and 5G, it has become more important to measure human activity in this connected world from the phones themselves. The newer protocols serve larger areas than ever before and a wider range of data, not just voice calls, so only the phone can accurately measure its location. Access to the application activity not only permits monitoring the performance and spatial coverage with which the users are served but, as a crowd-sourced, unbiased background source of input on all these subjects, also becomes a uniquely valuable resource for input to social science and government as well as telecom providers.
Submitted 19 June, 2018;
originally announced June 2018.
-
Cellular Network Traces Towards 5G: Usage, Analysis and Generation
Authors:
Francesco Malandrino,
Carla-Fabiana Chiasserini,
Scott Kirkpatrick
Abstract:
Deployment and demand traces are a crucial tool to study today's LTE systems, as well as their evolution toward 5G. In this paper, we use a set of real-world, crowdsourced traces, coming from the WeFi and OpenSignal apps, to investigate how present-day networks are deployed, and the load they serve. Given this information, we present a way to generate synthetic deployment and demand profiles, retaining the same features of their real-world counterparts. We further discuss a methodology using traces (both real-world and synthetic) to assess (i) to what extent the current deployment is adequate for the current and future demand, and (ii) the effectiveness of the existing strategies to improve network capacity. Applying our methodology to real-world traces, we find that present-day LTE deployments consist of multiple, entangled, medium- to large-sized cells. Furthermore, although today's LTE networks are overprovisioned when compared to the present traffic demand, they will need substantial capacity improvements in order to face the load increase forecasted between now and 2020.
Submitted 14 April, 2018;
originally announced April 2018.
-
How Close to the Edge? Delay/utilization tradeoffs in MEC
Authors:
Francesco Malandrino,
Scott Kirkpatrick,
Carla-Fabiana Chiasserini
Abstract:
Virtually all of the rapidly increasing data traffic consumed by mobile users requires some kind of processing, normally performed at cloud servers. A recent thrust, {\em mobile edge computing}, moves such processing to servers {\em within} the cellular mobile network. The large temporal and spatial variations to which mobile data usage is subject could make the reduced latency that edge clouds offer come at an unacceptable cost in redundant and underutilized infrastructure. We present some first empirical results on this question, based on large scale sampled crowd-sourced traces from several major cities spanning multiple operators and identifying the applications in use. We find opportunities to obtain both high server utilization and low application latency, but the best approaches will depend on the individual network operator's deployment strategy and geographic specifics of the cities we study.
Submitted 25 November, 2016;
originally announced November 2016.
-
The Impact of Vehicular Traffic Demand on 5G Caching Architectures: a Data-Driven Study
Authors:
Francesco Malandrino,
Carla-Fabiana Chiasserini,
Scott Kirkpatrick
Abstract:
The emergence of in-vehicle entertainment systems and self-driving vehicles, and the latter's need for high-resolution, up-to-date maps, will bring a further increase in the amount of data vehicles consume. Considering how difficult WiFi offloading in vehicular environments is, the bulk of this additional load will be served by cellular networks. Cellular networks, in turn, will resort to caching at the network edge in order to reduce the strain on their core network, an approach also known as mobile edge computing, or fog computing. In this work, we exploit a real-world, large-scale trace coming from the users of the We-Fi app in order to (i) understand how significant the contribution of vehicular users is to the global traffic demand; (ii) compare the performance of different caching architectures; and (iii) study how such performance is influenced by recommendation systems and content locality. We express the price of fog computing through a metric called price-of-fog, accounting for the extra caches to deploy compared to a traditional, centralized approach. We find that fog computing allows a very significant reduction of the load on the core network, and the price thereof is low in all cases and becomes negligible if content demand is location specific. We can therefore conclude that vehicular networks make an excellent case for the transition to mobile-edge caching: thanks to the peculiar features of vehicular demand, we can obtain all the benefits of fog computing, including a reduction of the load on the core network, while reducing the disadvantages to a minimum.
Submitted 24 November, 2016;
originally announced November 2016.
-
What is LTE actually used for? An answer through multi-operator, crowd-sourced measurement
Authors:
Francesco Malandrino,
Scott Kirkpatrick,
Danny Bickson
Abstract:
LTE networks are commonplace nowadays; however, comparatively little is known about where (and why) they are deployed, and the demand they serve. We shed some light on these issues through large-scale, crowd-sourced measurement. Our data, collected by users of the WeFi app, spans multiple operators and multiple cities, allowing us to observe a wide variety of deployment patterns. Surprisingly, we find that LTE is frequently used to improve the {\em coverage} of the network rather than its capacity, and that there is no evidence that video traffic is a primary driver for its deployment. Our insights suggest that such factors as pre-existing networks and commercial policies have a deeper impact on deployment decisions than purely technical considerations.
Submitted 23 November, 2016;
originally announced November 2016.
-
The Price of Fog: a Data-Driven Study on Caching Architectures in Vehicular Networks
Authors:
Francesco Malandrino,
Carla-Fabiana Chiasserini,
Scott Kirkpatrick
Abstract:
Vehicular users are expected to consume large amounts of data, for both entertainment and navigation purposes. This will put a strain on cellular networks, which will be able to cope with such a load only if proper caching is in place; this in turn raises the question of which caching architecture is best suited to deal with vehicular content consumption. In this paper, we leverage a large-scale, crowd-collected trace to (i) characterize the vehicular traffic demand, in terms of overall magnitude and content breakup, (ii) assess how different caching approaches perform against such a real-world load, and (iii) study the effect of recommendation systems and local contents. We define a price-of-fog metric, expressing the additional caching capacity to deploy when moving from traditional, centralized caching architectures to a "fog computing" approach, where caches are closer to the network edge. We find that for location-specific contents, such as the ones that vehicular users are most likely to request, such a price almost disappears. Vehicular networks thus make a strong case for the adoption of mobile-edge caching, as we are able to reap the benefit thereof -- including a reduction in the distance traveled by data, within the core network -- with few or none of the associated disadvantages.
Submitted 20 May, 2016;
originally announced May 2016.
-
Social Networks and Spin Glasses
Authors:
Scott Kirkpatrick,
Alex Kulakovsky,
Manuel Cebrian,
Alex Pentland
Abstract:
The networks formed from the links between telephones observed in a month's call detail records (CDRs) in the UK are analyzed, looking for the characteristics thought to identify a communications network or a social network. Some novel methods are employed. We find similarities to both types of network. We conclude that, just as analogies to spin glasses have proved fruitful for optimization of large scale practical problems, there will be opportunities to exploit a statistical mechanics of the formation and dynamics of social networks in today's electronically connected world.
Submitted 31 October, 2011; v1 submitted 7 August, 2010;
originally announced August 2010.
-
New Model of Internet Topology Using k-shell Decomposition
Authors:
Shai Carmi,
Shlomo Havlin,
Scott Kirkpatrick,
Yuval Shavitt,
Eran Shir
Abstract:
We introduce and use k-shell decomposition to investigate the topology of the Internet at the AS level. Our analysis separates the Internet into three sub-components: (a) a nucleus which is a small (~100 nodes) very well connected globally distributed subgraph; (b) a fractal sub-component that is able to connect the bulk of the Internet without congesting the nucleus, with self similar properties and critical exponents; and (c) dendrite-like structures, usually isolated nodes that are connected to the rest of the network through the nucleus only. This unique decomposition is robust, and provides insight into the underlying structure of the Internet and its functional consequences. Our approach is general and useful also when studying other complex networks.
Submitted 17 July, 2006;
originally announced July 2006.
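The k-shell decomposition named in the abstract above can be illustrated with a minimal sketch. It assumes the standard pruning definition (repeatedly remove nodes of degree at most k, assign them to shell k, then increase k); the function name `k_shell_decomposition` and the edge-list input format are hypothetical choices for illustration, not the authors' code.

```python
from collections import defaultdict


def k_shell_decomposition(edges):
    """Return {node: shell index} for an undirected graph given as an edge list.

    Assumed procedure (standard pruning): for k = 0, 1, 2, ... repeatedly
    remove every node whose remaining degree is <= k and label it shell k.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(neighbors) for node, neighbors in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes that can be pruned at the current threshold k.
        peel = [n for n in remaining if degree[n] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            n = peel.pop()
            if n not in remaining:
                continue  # already pruned via a duplicate entry
            remaining.discard(n)
            shell[n] = k
            for m in adj[n]:
                if m in remaining:
                    degree[m] -= 1
                    if degree[m] <= k:
                        peel.append(m)
    return shell


# Toy graph: a triangle (a, b, c) with one pendant node d attached to a.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]
toy_shells = k_shell_decomposition(toy_edges)
```

In this toy graph the pendant node falls in shell 1 and the triangle in shell 2; the highest shell plays the role the abstract calls the nucleus.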
-
MEDUSA - New Model of Internet Topology Using k-shell Decomposition
Authors:
Shai Carmi,
Shlomo Havlin,
Scott Kirkpatrick,
Yuval Shavitt,
Eran Shir
Abstract:
The k-shell decomposition of a random graph provides a different and more insightful separation of the roles of the different nodes in such a graph than does the usual analysis in terms of node degrees. We develop this approach in order to analyze the Internet's structure at a coarse level, that of the "Autonomous Systems" or ASes, the subnetworks out of which the Internet is assembled. We employ new data from DIMES (see http://www.netdimes.org), a distributed agent-based mapping effort which at present has attracted over 3800 volunteers running more than 7300 DIMES clients in over 85 countries. We combine this data with the AS graph information available from the RouteViews project at Univ. Oregon, and have obtained an Internet map with far more detail than any previous effort.
The data suggests a new picture of the AS-graph structure, which distinguishes a relatively large, redundantly connected core of nearly 100 ASes and two components that flow data in and out from this core. One component is fractally interconnected through peer links; the second makes direct connections to the core only. The model which results has superficial similarities with and important differences from the "Jellyfish" structure proposed by Tauro et al., so we call it a "Medusa." We plan to use this picture as a framework for measuring and extrapolating changes in the Internet's physical structure. Our k-shell analysis may also be relevant for estimating the function of nodes in the "scale-free" graphs extracted from other naturally-occurring processes.
Submitted 11 January, 2006;
originally announced January 2006.
-
Selfish vs. Unselfish Optimization of Network Creation
Authors:
Johannes J. Schneider,
Scott Kirkpatrick
Abstract:
We investigate several variants of a network creation model: a group of agents builds up a network between them while trying to keep the costs of this network small. The cost function consists of two addends, namely (i) a constant amount for each edge an agent buys and (ii) the minimum number of hops it takes sending messages to other agents. Despite the simplicity of this model, various complex network structures emerge depending on the weight between the two addends of the cost function and on the selfish or unselfish behaviour of the agents.
Submitted 3 August, 2005;
originally announced August 2005.
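The two-part cost function described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the hop term is taken to be the sum of shortest-path hop counts from an agent to every other agent, the weight `alpha` is the constant price per bought edge, and the names `agent_cost`, `hop_distances`, and the `bought` mapping are hypothetical, not taken from the paper.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: minimum hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def agent_cost(bought, agent, alpha):
    """Cost of one agent: alpha per edge it bought plus its total hop distance.

    bought maps each agent to the set of agents it bought an edge to.
    Edges are assumed usable in both directions once bought.
    """
    adj = {a: set() for a in bought}
    for a, targets in bought.items():
        for b in targets:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)

    dist = hop_distances(adj, agent)
    if len(dist) < len(adj):
        return float("inf")  # some agent is unreachable
    return alpha * len(bought.get(agent, ())) + sum(dist.values())


# Toy path network a - b - c, where a bought (a, b) and b bought (b, c).
toy_bought = {"a": {"b"}, "b": {"c"}, "c": set()}
toy_costs = {agent: agent_cost(toy_bought, agent, alpha=2) for agent in toy_bought}
```

A selfish agent would change its own bought edges to lower its own value of `agent_cost`, whereas an unselfish variant would evaluate the sum over all agents; which networks emerge depends on `alpha`, as the abstract notes.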
-
Comparing Beliefs, Surveys and Random Walks
Authors:
Erik Aurell,
Uri Gordon,
Scott Kirkpatrick
Abstract:
Survey propagation is a powerful technique from statistical physics that has been applied to solve the 3-SAT problem both in principle and in practice. We give, using only probability arguments, a common derivation of survey propagation, belief propagation and several interesting hybrid methods. We then present numerical experiments which use WSAT (a widely used random-walk based SAT solver) to quantify the complexity of the 3-SAT formulae as a function of their parameters, both as randomly generated and after simplification, guided by survey propagation. Some properties of WSAT which have not previously been reported make it an ideal tool for this purpose -- its mean cost is proportional to the number of variables in the formula (at a fixed ratio of clauses to variables) in the easy-SAT regime and slightly beyond, and its behavior in the hard-SAT regime appears to reflect the underlying structure of the solution space that has been predicted by replica symmetry-breaking arguments. An analysis of the tradeoffs between the various methods of search for satisfying assignments shows WSAT to be far more powerful than has been appreciated, and suggests some interesting new directions for practical algorithm development.
Submitted 11 January, 2005; v1 submitted 9 June, 2004;
originally announced June 2004.
-
2+p-SAT: Relation of Typical-Case Complexity to the Nature of the Phase Transition
Authors:
R. Monasson,
R. Zecchina,
S. Kirkpatrick,
B. Selman,
L. Troyansky
Abstract:
Heuristic methods for solution of problems in the NP-Complete class of decision problems often reach exact solutions, but fail badly at "phase boundaries", across which the decision to be reached changes from almost always having one value to almost always having a different value. We report an analytic solution and experimental investigations of the phase transition that occurs in the limit of very large problems in K-SAT. The nature of its "random first-order" phase transition, seen at values of K large enough to make the computational cost of solving typical instances increase exponentially with problem size, suggests a mechanism for the cost increase. There has been evidence for features like the "backbone" of frozen inputs which characterizes the UNSAT phase in K-SAT in the study of models of disordered materials, but this feature and this transition are uniquely accessible to analysis in K-SAT. The random first-order transition combines properties of the 1st order (discontinuous onset of order) and 2nd order (with power law scaling, e.g. of the width of the critical region in a finite system) transitions known in the physics of pure solids. Such transitions should occur in other combinatoric problems in the large N limit. Finally, improved search heuristics may be developed when a "backbone" is known to exist.
Submitted 6 October, 1999;
originally announced October 1999.