-
INDUS: Effective and Efficient Language Models for Scientific Applications
Authors:
Bishwaranjan Bhattacharjee,
Aashka Trivedi,
Masayasu Muraoka,
Muthukumaran Ramasubramanian,
Takuma Udagawa,
Iksha Gurung,
Rong Zhang,
Bharath Dandala,
Rahul Ramachandran,
Manil Maskey,
Kaylin Bugbee,
Mike Little,
Elizabeth Fancher,
Lauren Sanders,
Sylvain Costes,
Sergi Blanco-Cuaresma,
Kelly Lockhart,
Thomas Allen,
Felix Grezes,
Megan Ansdell,
Alberto Accomazzi,
Yousef El-Kurdi,
Davis Wertheimer,
Birgit Pfitzmann,
Cesar Berrospi Ramis
, et al. (9 additional authors not shown)
Abstract:
Large language models (LLMs) trained on general domain corpora showed remarkable results on natural language processing (NLP) tasks. However, previous research demonstrated LLMs trained using domain-focused corpora perform better on specialized tasks. Inspired by this pivotal insight, we developed INDUS, a comprehensive suite of LLMs tailored for the Earth science, biology, physics, heliophysics,…
▽ More
Large language models (LLMs) trained on general domain corpora showed remarkable results on natural language processing (NLP) tasks. However, previous research demonstrated LLMs trained using domain-focused corpora perform better on specialized tasks. Inspired by this pivotal insight, we developed INDUS, a comprehensive suite of LLMs tailored for the Earth science, biology, physics, heliophysics, planetary sciences and astrophysics domains and trained using curated scientific corpora drawn from diverse data sources. The suite of models include: (1) an encoder model trained using domain-specific vocabulary and corpora to address natural language understanding tasks, (2) a contrastive-learning-based general text embedding model trained using a diverse set of datasets drawn from multiple sources to address information retrieval tasks and (3) smaller versions of these models created using knowledge distillation techniques to address applications which have latency or resource constraints. We also created three new scientific benchmark datasets namely, CLIMATE-CHANGE-NER (entity-recognition), NASA-QA (extractive QA) and NASA-IR (IR) to accelerate research in these multi-disciplinary fields. Finally, we show that our models outperform both general-purpose encoders (RoBERTa) and existing domain-specific encoders (SciBERT) on these new tasks as well as existing benchmark tasks in the domains of interest.
△ Less
Submitted 20 May, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
-
RL-Based Hyperparameter Selection for Spectrum Sensing With CNNs
Authors:
Amir Mehrabian,
Maryam Sabbaghian,
Halim Yanikomeroglu
Abstract:
Selection of hyperparameters in deep neural networks is a challenging problem due to the wide search space and emergence of various layers with specific hyperparameters. There exists an absence of consideration for the neural architecture selection of convolutional neural networks (CNNs) for spectrum sensing. Here, we develop a method using reinforcement learning and Q-learning to systematically s…
▽ More
Selection of hyperparameters in deep neural networks is a challenging problem due to the wide search space and emergence of various layers with specific hyperparameters. There exists an absence of consideration for the neural architecture selection of convolutional neural networks (CNNs) for spectrum sensing. Here, we develop a method using reinforcement learning and Q-learning to systematically search and evaluate various architectures for generated datasets including different signals and channels in the spectrum sensing problem. We show by extensive simulations that CNN-based detectors proposed by our developed method outperform several detectors in the literature. For the most complex dataset, the proposed approach provides 9% enhancement in accuracy at the cost of higher computational complexity. Furthermore, a novel method using multi-armed bandit model for selection of the sensing time is proposed to achieve higher throughput and accuracy while minimizing the consumed energy. The method dynamically adjusts the sensing time under the time-varying condition of the channel without prior information. We demonstrate through a simulated scenario that the proposed method improves the achieved reward by about 20% compared to the conventional policies. Consequently, this study effectively manages the selection of important hyperparameters for CNN-based detectors offering superior performance of cognitive radio network.
△ Less
Submitted 29 January, 2024;
originally announced January 2024.
-
Energy Sustainability in Dense Radio Access Networks via High Altitude Platform Stations
Authors:
Maryam Salamatmoghadasi,
Amir Mehrabian,
Halim Yanikomeroglu
Abstract:
The growing demand for radio access networks (RANs) driven by advanced wireless technology and the everincreasing mobile traffic, faces significant energy consumption challenges that threaten sustainability. To address this, an architecture referring to the vertical heterogeneous network (vHetNet) has recently been proposed. Our study seeks to enhance network operations in terms of energy efficien…
▽ More
The growing demand for radio access networks (RANs) driven by advanced wireless technology and the everincreasing mobile traffic, faces significant energy consumption challenges that threaten sustainability. To address this, an architecture referring to the vertical heterogeneous network (vHetNet) has recently been proposed. Our study seeks to enhance network operations in terms of energy efficiency and sustainability by examining a vHetNet configuration, comprising a high altitude platform station (HAPS) acting as a super macro base station (SMBS), along with a macro base station (MBS) and a set of small base stations (SBSs) in a densely populated area.
△ Less
Submitted 15 December, 2023;
originally announced December 2023.
-
Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search
Authors:
Abbas Mehrabian,
Ankit Anand,
Hyunjik Kim,
Nicolas Sonnerat,
Matej Balog,
Gheorghe Comanici,
Tudor Berariu,
Andrew Lee,
Anian Ruoss,
Anna Bulanova,
Daniel Toyama,
Sam Blackwell,
Bernardino Romera Paredes,
Petar Veličković,
Laurent Orseau,
Joonkyung Lee,
Anurag Murty Naredla,
Doina Precup,
Adam Zsolt Wagner
Abstract:
This work studies a central extremal graph theory problem inspired by a 1975 conjecture of Erdős, which aims to find graphs with a given size (number of nodes) that maximize the number of edges without having 3- or 4-cycles. We formulate this problem as a sequential decision-making problem and compare AlphaZero, a neural network-guided tree search, with tabu search, a heuristic local search method…
▽ More
This work studies a central extremal graph theory problem inspired by a 1975 conjecture of Erdős, which aims to find graphs with a given size (number of nodes) that maximize the number of edges without having 3- or 4-cycles. We formulate this problem as a sequential decision-making problem and compare AlphaZero, a neural network-guided tree search, with tabu search, a heuristic local search method. Using either method, by introducing a curriculum -- jump-starting the search for larger graphs using good graphs found at smaller sizes -- we improve the state-of-the-art lower bounds for several sizes. We also propose a flexible graph-generation environment and a permutation-invariant network architecture for learning to search in the space of graphs.
△ Less
Submitted 6 November, 2023;
originally announced November 2023.
-
A Design Methodology for Post-Moore's Law Accelerators: The Case of a Photonic Neuromorphic Processor
Authors:
Armin Mehrabian,
Volker J. Sorger,
Tarek El-Ghazawi
Abstract:
Over the past decade alternative technologies have gained momentum as conventional digital electronics continue to approach their limitations, due to the end of Moore's Law and Dennard Scaling. At the same time, we are facing new application challenges such as those due to the enormous increase in data. The attention, has therefore, shifted from homogeneous computing to specialized heterogeneous s…
▽ More
Over the past decade alternative technologies have gained momentum as conventional digital electronics continue to approach their limitations, due to the end of Moore's Law and Dennard Scaling. At the same time, we are facing new application challenges such as those due to the enormous increase in data. The attention, has therefore, shifted from homogeneous computing to specialized heterogeneous solutions. As an example, brain-inspired computing has re-emerged as a viable solution for many applications. Such new processors, however, have widened the abstraction gamut from device level to applications. Therefore, efficient abstractions that can provide vertical design-flow tools for such technologies became critical. Photonics in general, and neuromorphic photonics in particular, are among the promising alternatives to electronics. While the arsenal of device level toolbox for photonics, and high-level neural network platforms are rapidly expanding, there has not been much work to bridge this gap. Here, we present a design methodology to mitigate this problem by extending high-level hardware-agnostic neural network design tools with functional and performance models of photonic components. In this paper we detail this tool and methodology by using design examples and associated results. We show that adopting this approach enables designers to efficiently navigate the design space and devise hardware-aware systems with alternative technologies.
△ Less
Submitted 15 June, 2020;
originally announced June 2020.
-
A New Approach for Macroscopic Analysis to Improve the Technical and Economic Impacts of Urban Interchanges on Traffic Networks
Authors:
Seyed Hassan Hosseini,
Ahmad Mehrabian,
Zhila Dehdari Ebrahimi,
Mohsen Momenitabar,
Mohammad Arani
Abstract:
Pursuing three important elements including economic, safety, and traffic are the overall objective of decision evaluation across all transport projects. In this study, we investigate the feasibility of the development of city interchanges and road connections for network users. To achieve this goal, a series of minor goals are required to be met in advance including determining benefits, costs of…
▽ More
Pursuing three important elements including economic, safety, and traffic are the overall objective of decision evaluation across all transport projects. In this study, we investigate the feasibility of the development of city interchanges and road connections for network users. To achieve this goal, a series of minor goals are required to be met in advance including determining benefits, costs of implement-ing new highway interchanges, quantifying the effective parameters, the increase in fuel consumption, the reduction in travel time, and finally influence on travel speed. In this study, geometric advancement of Hakim highway, and Yadegar-e-Emam Highway were investigated in the Macro view from the cloverleaf inter-section with a low capacity to a three-level directional intersection of the enhanced cloverleaf. For this purpose, the simulation was done by EMME software of INRO Company. The results of the method were evaluated by the objective of net present value (NPV), and the benefit and cost of each one was stated precisely in different years. At the end, some suggestion has been provided.
△ Less
Submitted 17 September, 2020; v1 submitted 9 March, 2020;
originally announced March 2020.
-
Artificial Synapse with Mnemonic Functionality using GSST-based Photonic Integrated Memory
Authors:
Mario Miscuglio,
Jiawei Meng,
Omer Yesiliurt,
Yifei Zhang,
Ludmila J. Prokopeva,
Armin Mehrabian,
Juejun Hu,
Alexander V. Kildishev,
Volker J. Sorger
Abstract:
Machine-learning tasks performed by neural networks demonstrated useful capabilities for producing reliable, and repeatable intelligent decisions. Integrated photonics, leveraging both component miniaturization and the wave-nature of the signals, can potentially outperform electronics architectures when performing inference tasks. However, the missing photon-photon force challenges non-volatile ph…
▽ More
Machine-learning tasks performed by neural networks demonstrated useful capabilities for producing reliable, and repeatable intelligent decisions. Integrated photonics, leveraging both component miniaturization and the wave-nature of the signals, can potentially outperform electronics architectures when performing inference tasks. However, the missing photon-photon force challenges non-volatile photonic device-functionality required for efficient neural networks. Here we present a novel concept and optimization of multi-level discrete-state non-volatile photonic memory based on an ultra-compact (<4um) hybrid phase change material GSST-silicon Mach Zehnder modulator, with low insertion losses (3dB), to serve as node in a photonic neural network. An optimized electro-thermal switching mechanism, induced by Joule heating through tungsten contacts, is engineered. This operation allows to change the phase of the GSST film thus providing weight updating functionality to the network. We show that a 5 V pulse-train (<1 us, 20 pulses) applied to a serpentine contact produces crystallization and a single pulse of longer duration (2 us) amorphization, used to set the analog synaptic weights of a neuron. Emulating an opportunely trained 100x100 fully connected multilayered perceptron neural network with this weighting functionality embedded as photonic memory, shows up to 93% inference accuracy and robustness towards noise when performing predictions of unseen data
△ Less
Submitted 22 December, 2019; v1 submitted 4 November, 2019;
originally announced December 2019.
-
Regret Bounds for Batched Bandits
Authors:
Hossein Esfandiari,
Amin Karbasi,
Abbas Mehrabian,
Vahab Mirrokni
Abstract:
We present simple and efficient algorithms for the batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve over the best-known regret bounds for any number of batches. In particular, our algorithms in both settings achieve the optimal expected regrets by using only a logarithmic number of batches. We also study th…
▽ More
We present simple and efficient algorithms for the batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve over the best-known regret bounds for any number of batches. In particular, our algorithms in both settings achieve the optimal expected regrets by using only a logarithmic number of batches. We also study the batched adversarial multi-armed bandit problem for the first time and find the optimal regret, up to logarithmic factors, of any algorithm with predetermined batch sizes.
△ Less
Submitted 18 February, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems
Authors:
Sharan Vaswani,
Abbas Mehrabian,
Audrey Durand,
Branislav Kveton
Abstract:
We propose $\tt RandUCB$, a bandit strategy that builds on theoretically derived confidence intervals similar to upper confidence bound (UCB) algorithms, but akin to Thompson sampling (TS), it uses randomization to trade off exploration and exploitation. In the $K$-armed bandit setting, we show that there are infinitely many variants of $\tt RandUCB$, all of which achieve the minimax-optimal…
▽ More
We propose $\tt RandUCB$, a bandit strategy that builds on theoretically derived confidence intervals similar to upper confidence bound (UCB) algorithms, but akin to Thompson sampling (TS), it uses randomization to trade off exploration and exploitation. In the $K$-armed bandit setting, we show that there are infinitely many variants of $\tt RandUCB$, all of which achieve the minimax-optimal $\widetilde{O}(\sqrt{K T})$ regret after $T$ rounds. Moreover, for a specific multi-armed bandit setting, we show that both UCB and TS can be recovered as special cases of $\tt RandUCB$. For structured bandits, where each arm is associated with a $d$-dimensional feature vector and rewards are distributed according to a linear or generalized linear model, we prove that $\tt RandUCB$ achieves the minimax-optimal $\widetilde{O}(d \sqrt{T})$ regret even in the case of infinitely many arms. Through experiments in both the multi-armed and structured bandit settings, we demonstrate that $\tt RandUCB$ matches or outperforms TS and other randomized exploration strategies. Our theoretical and empirical results together imply that $\tt RandUCB$ achieves the best of both worlds.
△ Less
Submitted 22 March, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
-
A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks
Authors:
Armin Mehrabian,
Mario Miscuglio,
Yousra Alkabani,
Volker J. Sorger,
Tarek El-Ghazawi
Abstract:
Neural Networks (NNs) have become the mainstream technology in the artificial intelligence (AI) renaissance over the past decade. Among different types of neural networks, convolutional neural networks (CNNs) have been widely adopted as they have achieved leading results in many fields such as computer vision and speech recognition. This success in part is due to the widespread availability of cap…
▽ More
Neural Networks (NNs) have become the mainstream technology in the artificial intelligence (AI) renaissance over the past decade. Among different types of neural networks, convolutional neural networks (CNNs) have been widely adopted as they have achieved leading results in many fields such as computer vision and speech recognition. This success in part is due to the widespread availability of capable underlying hardware platforms. Applications have always been a driving factor for design of such hardware architectures. Hardware specialization can expose us to novel architectural solutions, which can outperform general purpose computers for tasks at hand. Although different applications demand for different performance measures, they all share speed and energy efficiency as high priorities. Meanwhile, photonics processing has seen a resurgence due to its inherited high speed and low power nature. Here, we investigate the potential of using photonics in CNNs by proposing a CNN accelerator design based on Winograd filtering algorithm. Our evaluation results show that while a photonic accelerator can compete with current-state-of-the-art electronic platforms in terms of both speed and power, it has the potential to improve the energy efficiency by up to three orders of magnitude.
△ Less
Submitted 4 December, 2019; v1 submitted 25 June, 2019;
originally announced June 2019.
-
A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players
Authors:
Etienne Boursier,
Emilie Kaufmann,
Abbas Mehrabian,
Vianney Perchet
Abstract:
We study a multiplayer stochastic multi-armed bandit problem in which players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider the challenging heterogeneous setting, in which different arms may have different means for different players, and propose a new and efficient algorithm that combines the idea of…
▽ More
We study a multiplayer stochastic multi-armed bandit problem in which players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider the challenging heterogeneous setting, in which different arms may have different means for different players, and propose a new and efficient algorithm that combines the idea of leveraging forced collisions for implicit communication and that of performing matching eliminations. We present a finite-time analysis of our algorithm, giving the first sublinear minimax regret bound for this problem, and prove that if the optimal assignment of players to arms is unique, our algorithm attains the optimal $O(\ln(T))$ regret, solving an open question raised at NeurIPS 2018.
△ Less
Submitted 20 March, 2020; v1 submitted 4 February, 2019;
originally announced February 2019.
-
The total variation distance between high-dimensional Gaussians with the same mean
Authors:
Luc Devroye,
Abbas Mehrabian,
Tommy Reddad
Abstract:
Given two high-dimensional Gaussians with the same mean, we prove a lower and an upper bound for their total variation distance, which are within a constant factor of one another.
Given two high-dimensional Gaussians with the same mean, we prove a lower and an upper bound for their total variation distance, which are within a constant factor of one another.
△ Less
Submitted 22 October, 2023; v1 submitted 19 October, 2018;
originally announced October 2018.
-
All-optical Nonlinear Activation Function for Photonic Neural Networks
Authors:
Mario Miscuglio,
Armin Mehrabian,
Zibo Hu,
Shaimaa I. Azzam,
Jonathan K. George,
Alexander V. Kildishev,
Matthew Pelton,
Volker J. Sorger
Abstract:
With the recent successes of neural networks (NN) to perform machine-learning tasks, photonic-based NN designs may enable high throughput and low power neuromorphic compute paradigms since they bypass the parasitic charging of capacitive wires. Thus, engineering data-information processors capable of executing NN algorithms with high efficiency is of major importance for applications ranging from…
▽ More
With the recent successes of neural networks (NN) to perform machine-learning tasks, photonic-based NN designs may enable high throughput and low power neuromorphic compute paradigms since they bypass the parasitic charging of capacitive wires. Thus, engineering data-information processors capable of executing NN algorithms with high efficiency is of major importance for applications ranging from pattern recognition to classification. Our hypothesis is therefore, that if the time-limiting electro-optic conversion of current photonic NN designs could be postponed until the very end of the network, then the execution time of the photonic algorithm is simple the delay of the time-of-flight of photons through the NN, which is on the order of picoseconds for integrated photonics. Exploring such all-optical NN, in this work we discuss two independent approaches of implementing the optical perceptrons nonlinear activation function based on nanophotonic structures exhibiting i) induced transparency and ii) reverse saturated absorption. Our results show that the all-optical nonlinearity provides about 3 and 7 dB extinction ratio for the two systems considered, respectively, and classification accuracies of an exemplary MNIST task of 97% and near 100% are found, which rivals that of software based trained NNs, yet with ignored noise in the network. Together with a developed concept for an all-optical perceptron, these findings point to the possibility of realizing pure photonic NNs with potentially unmatched throughput and even energy consumption for next generation information processing hardware.
△ Less
Submitted 6 October, 2018; v1 submitted 2 October, 2018;
originally announced October 2018.
-
Neuromorphic photonics with electro-absorption modulators
Authors:
Jonathan George,
Armin Mehrabian,
Rubab Amin,
Jiawei Meng,
Thomas Ferreira de Lima,
Alexander N. Tait,
Bhavin J. Shastri,
Tarek El-Ghazawi,
Paul R. Prucnal,
Volker J. Sorger
Abstract:
Photonic neural networks benefit from both the high channel capacity- and the wave nature of light acting as an effective weighting mechanism through linear optics. The neuron's activation function, however, requires nonlinearity which can be achieved either through nonlinear optics or electro-optics. Nonlinear optics, while potentially faster, is challenging at low optical power. With electro-opt…
▽ More
Photonic neural networks benefit from both the high channel capacity- and the wave nature of light acting as an effective weighting mechanism through linear optics. The neuron's activation function, however, requires nonlinearity which can be achieved either through nonlinear optics or electro-optics. Nonlinear optics, while potentially faster, is challenging at low optical power. With electro-optics, a photodiode integrating the weighted products of a photonic perceptron can be paired directly to a modulator, which creates a nonlinear transfer function for efficient operating. Here we model the activation functions of five types of electro-absorption modulators, analyze their individual performance over varying performance parameters, and simulate their combined effect on the inference of the neural network
△ Less
Submitted 31 July, 2018;
originally announced September 2018.
-
Multiplayer bandits without observing collision information
Authors:
Gabor Lugosi,
Abbas Mehrabian
Abstract:
We study multiplayer stochastic multi-armed bandit problems in which the players cannot communicate and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider two feedback models: a model in which the players can observe whether a collision has occurred and a more difficult setup when no collision information is available. We give the…
▽ More
We study multiplayer stochastic multi-armed bandit problems in which the players cannot communicate and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider two feedback models: a model in which the players can observe whether a collision has occurred and a more difficult setup when no collision information is available. We give the first theoretical guarantees for the second model: an algorithm with a logarithmic regret, and an algorithm with a square-root regret type that does not depend on the gaps between the means. For the first model, we give the first square-root regret bounds that do not depend on the gaps. Building on these ideas, we also give an algorithm for reaching approximate Nash equilibria quickly in stochastic anti-coordination games.
△ Less
Submitted 4 April, 2021; v1 submitted 25 August, 2018;
originally announced August 2018.
-
PCNNA: A Photonic Convolutional Neural Network Accelerator
Authors:
Armin Mehrabian,
Yousra Al-Kabani,
Volker J Sorger,
Tarek El-Ghazawi
Abstract:
Convolutional Neural Networks (CNN) have been the centerpiece of many applications including but not limited to computer vision, speech processing, and Natural Language Processing (NLP). However, the computationally expensive convolution operations impose many challenges to the performance and scalability of CNNs. In parallel, photonic systems, which are traditionally employed for data communicati…
▽ More
Convolutional Neural Networks (CNN) have been the centerpiece of many applications including but not limited to computer vision, speech processing, and Natural Language Processing (NLP). However, the computationally expensive convolution operations impose many challenges to the performance and scalability of CNNs. In parallel, photonic systems, which are traditionally employed for data communication, have enjoyed recent popularity for data processing due to their high bandwidth, low power consumption, and reconfigurability. Here we propose a Photonic Convolutional Neural Network Accelerator (PCNNA) as a proof of concept design to speedup the convolution operation for CNNs. Our design is based on the recently introduced silicon photonic microring weight banks, which use broadcast-and-weight protocol to perform Multiply And Accumulate (MAC) operation and move data through layers of a neural network. Here, we aim to exploit the synergy between the inherent parallelism of photonics in the form of Wavelength Division Multiplexing (WDM) and sparsity of connections between input feature maps and kernels in CNNs. While our full system design offers up to more than 3 orders of magnitude speedup in execution time, its optical core potentially offers more than 5 order of magnitude speedup compared to state-of-the-art electronic counterparts.
△ Less
Submitted 23 July, 2018;
originally announced July 2018.
-
The Minimax Learning Rates of Normal and Ising Undirected Graphical Models
Authors:
Luc Devroye,
Abbas Mehrabian,
Tommy Reddad
Abstract:
Let $G$ be an undirected graph with $m$ edges and $d$ vertices. We show that $d$-dimensional Ising models on $G$ can be learned from $n$ i.i.d. samples within expected total variation distance some constant factor of $\min\{1, \sqrt{(m + d)/n}\}$, and that this rate is optimal. We show that the same rate holds for the class of $d$-dimensional multivariate normal undirected graphical models with re…
▽ More
Let $G$ be an undirected graph with $m$ edges and $d$ vertices. We show that $d$-dimensional Ising models on $G$ can be learned from $n$ i.i.d. samples within expected total variation distance some constant factor of $\min\{1, \sqrt{(m + d)/n}\}$, and that this rate is optimal. We show that the same rate holds for the class of $d$-dimensional multivariate normal undirected graphical models with respect to $G$. We also identify the optimal rate of $\min\{1, \sqrt{m/n}\}$ for Ising models with no external magnetic field.
△ Less
Submitted 3 June, 2020; v1 submitted 18 June, 2018;
originally announced June 2018.
-
Some techniques in density estimation
Authors:
Hassan Ashtiani,
Abbas Mehrabian
Abstract:
Density estimation is an interdisciplinary topic at the intersection of statistics, theoretical computer science and machine learning. We review some old and new techniques for bounding the sample complexity of estimating densities of continuous distributions, focusing on the class of mixtures of Gaussians and its subclasses. In particular, we review the main techniques used to prove the new sampl…
▽ More
Density estimation is an interdisciplinary topic at the intersection of statistics, theoretical computer science and machine learning. We review some old and new techniques for bounding the sample complexity of estimating densities of continuous distributions, focusing on the class of mixtures of Gaussians and its subclasses. In particular, we review the main techniques used to prove the new sample complexity bounds for mixtures of Gaussians by Ashtiani, Ben-David, Harvey, Liaw, Mehrabian, and Plan arXiv:1710.05209.
△ Less
Submitted 22 February, 2018; v1 submitted 11 January, 2018;
originally announced January 2018.
-
Near-optimal Sample Complexity Bounds for Robust Learning of Gaussians Mixtures via Compression Schemes
Authors:
Hassan Ashtiani,
Shai Ben-David,
Nick Harvey,
Christopher Liaw,
Abbas Mehrabian,
Yaniv Plan
Abstract:
We prove that $\tildeΘ(k d^2 / \varepsilon^2)$ samples are necessary and sufficient for learning a mixture of $k$ Gaussians in $\mathbb{R}^d$, up to error $\varepsilon$ in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that $\tilde{O}(k d / \varepsilon^2)$ samples suffice, matching a known lower…
▽ More
We prove that $\tildeΘ(k d^2 / \varepsilon^2)$ samples are necessary and sufficient for learning a mixture of $k$ Gaussians in $\mathbb{R}^d$, up to error $\varepsilon$ in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that $\tilde{O}(k d / \varepsilon^2)$ samples suffice, matching a known lower bound. Moreover, these results hold in the agnostic-learning/robust-estimation setting as well, where the target distribution is only approximately a mixture of Gaussians.
The upper bound is shown using a novel technique for distribution learning based on a notion of `compression.' Any class of distributions that allows such a compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in $\mathbb{R}^d$ admits a small-sized compression scheme.
△ Less
Submitted 21 July, 2020; v1 submitted 14 October, 2017;
originally announced October 2017.
-
D3NOC: Dynamic Data-Driven Network On Chip in Photonic Electronic Hybrids
Authors:
Armin Mehrabian,
Shuai Sun,
Vikram K. Narayana,
Volker J. Sorger,
Tarek El-Ghazawi
Abstract:
In this paper, we present a reconfigurable hybrid Photonic-Plasmonic Network-on-Chip (NoC) based on the Dynamic Data Driven Application System (DDDAS) paradigm. In DDDAS computations and measurements form a dynamic closed feedback loop in which they tune one another in response to changes in the environment. Our proposed system enables dynamic augmentation of a base electrical mesh topology with a…
▽ More
In this paper, we present a reconfigurable hybrid Photonic-Plasmonic Network-on-Chip (NoC) based on the Dynamic Data Driven Application System (DDDAS) paradigm. In DDDAS computations and measurements form a dynamic closed feedback loop in which they tune one another in response to changes in the environment. Our proposed system enables dynamic augmentation of a base electrical mesh topology with an optical express bus during the run-time. In addition, the measurement process itself adjusts to the environment. In order to achieve lower latencies, lower dynamic power, and higher throughput, we take advantage of a Configurable Hybrid Photonic Plasmonic Interconnect (CHyPPI) for our reconfigurable connections. We evaluate the performance and power of our system against kernels from NAS Parallel Benchmark (NPB) in addition to some synthetically generated traffic. In comparison to a 16x16 base electrical mesh, D3NOC shows up to 89% latency and 67% dynamic power net improvements beyond overhead-corrected performance. It should be noted that the design-space of NoC reconfiguration is vast and the goal of this study is not design-space exploration. Our goal is to show the potentials of adaptive dynamic measurements when coupled with other reconfiguration techniques in the NoC context.
△ Less
Submitted 22 August, 2017;
originally announced August 2017.
-
Notes on Growing a Tree in a Graph
Authors:
Luc Devroye,
Vida Dujmović,
Alan Frieze,
Abbas Mehrabian,
Pat Morin,
Bruce Reed
Abstract:
We study the height of a spanning tree $T$ of a graph $G$ obtained by starting with a single vertex of $G$ and repeatedly selecting, uniformly at random, an edge of $G$ with exactly one endpoint in $T$ and adding this edge to $T$.
We study the height of a spanning tree $T$ of a graph $G$ obtained by starting with a single vertex of $G$ and repeatedly selecting, uniformly at random, an edge of $G$ with exactly one endpoint in $T$ and adding this edge to $T$.
△ Less
Submitted 4 July, 2017; v1 submitted 30 June, 2017;
originally announced July 2017.
-
Tight Load Balancing via Randomized Local Search
Authors:
Petra Berenbrink,
Peter Kling,
Christopher Liaw,
Abbas Mehrabian
Abstract:
We consider the following balls-into-bins process with $n$ bins and $m$ balls: each ball is equipped with a mutually independent exponential clock of rate 1. Whenever a ball's clock rings, the ball samples a random bin and moves there if the number of balls in the sampled bin is smaller than in its current bin. This simple process models a typical load balancing problem where users (balls) seek a…
▽ More
We consider the following balls-into-bins process with $n$ bins and $m$ balls: each ball is equipped with a mutually independent exponential clock of rate 1. Whenever a ball's clock rings, the ball samples a random bin and moves there if the number of balls in the sampled bin is smaller than in its current bin. This simple process models a typical load balancing problem where users (balls) seek a selfish improvement of their assignment to resources (bins). From a game theoretic perspective, this is a randomized approach to the well-known Koutsoupias-Papadimitriou model, while it is known as randomized local search (RLS) in load balancing literature. Up to now, the best bound on the expected time to reach perfect balance was $O\left({(\ln n)}^2+\ln(n)\cdot n^2/m\right)$ due to Ganesh, Lilienthal, Manjunath, Proutiere, and Simatos (Load balancing via random local search in closed and open systems, Queueing Systems, 2012). We improve this to an asymptotically tight $O\left(\ln(n)+n^2/m\right)$. Our analysis is based on the crucial observation that performing "destructive moves" (reversals of RLS moves) cannot decrease the balancing time. This allows us to simplify problem instances and to ignore "inconvenient moves" in the analysis.
△ Less
Submitted 29 June, 2017;
originally announced June 2017.
-
Sample-Efficient Learning of Mixtures
Authors:
Hassan Ashtiani,
Shai Ben-David,
Abbas Mehrabian
Abstract:
We consider PAC learning of probability distributions (a.k.a. density estimation), where we are given an i.i.d. sample generated from an unknown target distribution, and want to output a distribution that is close to the target in total variation distance. Let $\mathcal F$ be an arbitrary class of probability distributions, and let $\mathcal{F}^k$ denote the class of $k$-mixtures of elements of…
▽ More
We consider PAC learning of probability distributions (a.k.a. density estimation), where we are given an i.i.d. sample generated from an unknown target distribution, and want to output a distribution that is close to the target in total variation distance. Let $\mathcal F$ be an arbitrary class of probability distributions, and let $\mathcal{F}^k$ denote the class of $k$-mixtures of elements of $\mathcal F$. Assuming the existence of a method for learning $\mathcal F$ with sample complexity $m_{\mathcal{F}}(ε)$, we provide a method for learning $\mathcal F^k$ with sample complexity $O({k\log k \cdot m_{\mathcal F}(ε) }/{ε^{2}})$. Our mixture learning algorithm has the property that, if the $\mathcal F$-learner is proper/agnostic, then the $\mathcal F^k$-learner would be proper/agnostic as well.
This general result enables us to improve the best known sample complexity upper bounds for a variety of important mixture classes. First, we show that the class of mixtures of $k$ axis-aligned Gaussians in $\mathbb{R}^d$ is PAC-learnable in the agnostic setting with $\widetilde{O}({kd}/{ε^ 4})$ samples, which is tight in $k$ and $d$ up to logarithmic factors. Second, we show that the class of mixtures of $k$ Gaussians in $\mathbb{R}^d$ is PAC-learnable in the agnostic setting with sample complexity $\widetilde{O}({kd^2}/{ε^ 4})$, which improves the previous known bounds of $\widetilde{O}({k^3d^2}/{ε^ 4})$ and $\widetilde{O}(k^4d^4/ε^ 2)$ in its dependence on $k$ and $d$. Finally, we show that the class of mixtures of $k$ log-concave distributions over $\mathbb{R}^d$ is PAC-learnable using $\widetilde{O}(d^{(d+5)/2}ε^{-(d+9)/2}k)$ samples.
△ Less
Submitted 3 June, 2018; v1 submitted 5 June, 2017;
originally announced June 2017.
-
The string of diamonds is nearly tight for rumour spreading
Authors:
Omer Angel,
Abbas Mehrabian,
Yuval Peres
Abstract:
For a rumour spreading protocol, the spread time is defined as the first time that everyone learns the rumour. We compare the synchronous push&pull rumour spreading protocol with its asynchronous variant, and show that for any $n$-vertex graph and any starting vertex, the ratio between their expected spread times is bounded by $O \left({n}^{1/3}{\log^{2/3} n}\right)$. This improves the…
▽ More
For a rumour spreading protocol, the spread time is defined as the first time that everyone learns the rumour. We compare the synchronous push&pull rumour spreading protocol with its asynchronous variant, and show that for any $n$-vertex graph and any starting vertex, the ratio between their expected spread times is bounded by $O \left({n}^{1/3}{\log^{2/3} n}\right)$. This improves the $O(\sqrt n)$ upper bound of Giakkoupis, Nazari, and Woelfel (in Proceedings of ACM Symposium on Principles of Distributed Computing, 2016). Our bound is tight up to a factor of $O(\log n)$, as illustrated by the string of diamonds graph. We also show that if for a pair $α,β$ of real numbers, there exists infinitely many graphs for which the two spread times are $n^α$ and $n^β$ in expectation, then $0\leqα\leq 1$ and $α\leq β\leq \frac13 + \frac23 α$; and we show each such pair $α,β$ is achievable.
△ Less
Submitted 18 July, 2017; v1 submitted 4 April, 2017;
originally announced April 2017.
-
HyPPI NoC: Bringing Hybrid Plasmonics to an Opto-Electronic Network-on-Chip
Authors:
Vikram K. Narayana,
Shuai Sun,
Armin Mehrabian,
Volker J. Sorger,
Tarek El-Ghazawi
Abstract:
As we move towards an era of hundreds of cores, the research community has witnessed the emergence of opto-electronic network on-chip designs based on nanophotonics, in order to achieve higher network throughput, lower latencies, and lower dynamic power. However, traditional nanophotonics options face limitations such as large device footprints compared with electronics, higher static power due to…
▽ More
As we move towards an era of hundreds of cores, the research community has witnessed the emergence of opto-electronic network on-chip designs based on nanophotonics, in order to achieve higher network throughput, lower latencies, and lower dynamic power. However, traditional nanophotonics options face limitations such as large device footprints compared with electronics, higher static power due to continuous laser operation, and an upper limit on achievable data rates due to large device capacitances. Nanoplasmonics is an emerging technology that has the potential for providing transformative gains on multiple metrics due to its potential to increase the light-matter interaction. In this paper, we propose and analyze a hybrid opto-electric NoC that incorporates Hybrid Plasmonics Photonics Interconnect (HyPPI), an optical interconnect that combines photonics with plasmonics. We explore various opto-electronic network hybridization options by augmenting a mesh network with HyPPI links, and compare them with the equivalent options afforded by conventional nanophotonics as well as pure electronics. Our design space exploration indicates that augmenting an electronic NoC with HyPPI gives a performance to cost ratio improvement of up to 1.8x. To further validate our estimates, we conduct trace based simulations using the NAS Parallel Benchmark suite. These benchmarks show latency improvements up to 1.64x, with negligible energy increase. We then further carry out performance and cost projections for fully optical NoCs, using HyPPI as well as conventional nanophotonics. These futuristic projections indicate that all-HyPPI NoCs would be two orders more energy efficient than electronics, and two orders more area efficient than all-photonic NoCs.
△ Less
Submitted 14 March, 2017;
originally announced March 2017.
-
Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks
Authors:
Peter L. Bartlett,
Nick Harvey,
Chris Liaw,
Abbas Mehrabian
Abstract:
We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(W L \log(W))$, and provide examples with VC-dimension $Ω( W L \log(W/L) )$. This improves both the previously kn…
▽ More
We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(W L \log(W))$, and provide examples with VC-dimension $Ω( W L \log(W/L) )$. This improves both the previously known upper bounds and lower bounds. In terms of the number $U$ of non-linear units, we prove a tight bound $Θ(W U)$ on the VC-dimension. All of these bounds generalize to arbitrary piecewise linear activation functions, and also hold for the pseudodimensions of these function classes.
Combined with previous results, this gives an intriguing range of dependencies of the VC-dimension on depth for networks with different non-linearities: there is no dependence for piecewise-constant, linear dependence for piecewise-linear, and no more than quadratic dependence for general piecewise-polynomial.
△ Less
Submitted 15 October, 2017; v1 submitted 8 March, 2017;
originally announced March 2017.
-
A Universal Multi-Hierarchy Figure-of-Merit for On-Chip Computing and Communications
Authors:
Shuai Sun,
Vikram K. Narayana,
Armin Mehrabian,
Tarek El-Ghazawi,
Volker J. Sorger
Abstract:
Continuing demands for increased compute efficiency and communication bandwidth have led to the development of novel interconnect technologies with the potential to outperform conventional electrical interconnects. With a plurality of interconnect technologies to include electronics, photonics, plasmonics, and hybrids thereof, the simple approach of counting on-chip devices to capture performance…
▽ More
Continuing demands for increased compute efficiency and communication bandwidth have led to the development of novel interconnect technologies with the potential to outperform conventional electrical interconnects. With a plurality of interconnect technologies to include electronics, photonics, plasmonics, and hybrids thereof, the simple approach of counting on-chip devices to capture performance is insufficient. While some efforts have been made to capture the performance evolution more accurately, they eventually deviate from the observed development pace. Thus, a holistic figure of merit (FOM) is needed to adequately compare these recent technology paradigms. Here we introduce the Capability-to-Latency-Energy-Amount-Resistance (CLEAR) FOM derived from device and link performance criteria of both active optoelectronic devices and passive components alike. As such CLEAR incorporates communication delay, energy efficiency, on-chip scaling and economic cost. We show that CLEAR accurately describes compute development including most recent machines. Since this FOM is derived bottom-up, we demonstrate remarkable adaptability to applications ranging from device-level to network and system-level. Applying CLEAR to benchmark device, link, and network performance against fundamental physical compute and communication limits shows that photonics is competitive even for fractions of the die-size, thus making a case for on-chip optical interconnects.
△ Less
Submitted 7 December, 2016;
originally announced December 2016.
-
Rumours spread slowly in a small world spatial network
Authors:
Jeannette Janssen,
Abbas Mehrabian
Abstract:
Rumour spreading is a protocol for modelling the spread of information through a network via user-to-user interaction. The Spatial Preferred Attachment (SPA) model is a random graph model for complex networks: vertices are placed in a metric space, and the link probability depends on the metric distance between vertices, and on their degree. We show that the SPA model typically produces graphs tha…
▽ More
Rumour spreading is a protocol for modelling the spread of information through a network via user-to-user interaction. The Spatial Preferred Attachment (SPA) model is a random graph model for complex networks: vertices are placed in a metric space, and the link probability depends on the metric distance between vertices, and on their degree. We show that the SPA model typically produces graphs that have small effective diameter, i.e. $O(\log^2 n)$, while rumour spreading is relatively slow, namely polynomial in $n$.
△ Less
Submitted 5 July, 2016;
originally announced August 2016.
-
A simple tool for bounding the deviation of random matrices on geometric sets
Authors:
Christopher Liaw,
Abbas Mehrabian,
Yaniv Plan,
Roman Vershynin
Abstract:
Let $A$ be an isotropic, sub-gaussian $m \times n$ matrix. We prove that the process $Z_x := \|Ax\|_2 - \sqrt m \|x\|_2$ has sub-gaussian increments. Using this, we show that for any bounded set $T \subseteq \mathbb{R}^n$, the deviation of $\|Ax\|_2$ around its mean is uniformly bounded by the Gaussian complexity of $T$. We also prove a local version of this theorem, which allows for unbounded set…
▽ More
Let $A$ be an isotropic, sub-gaussian $m \times n$ matrix. We prove that the process $Z_x := \|Ax\|_2 - \sqrt m \|x\|_2$ has sub-gaussian increments. Using this, we show that for any bounded set $T \subseteq \mathbb{R}^n$, the deviation of $\|Ax\|_2$ around its mean is uniformly bounded by the Gaussian complexity of $T$. We also prove a local version of this theorem, which allows for unbounded sets. These theorems have various applications, some of which are reviewed in this paper. In particular, we give a new result regarding model selection in the constrained linear model.
△ Less
Submitted 7 June, 2016; v1 submitted 2 March, 2016;
originally announced March 2016.
-
On the push&pull protocol for rumour spreading
Authors:
Huseyin Acan,
Andrea Collevecchio,
Abbas Mehrabian,
Nick Wormald
Abstract:
The asynchronous push&pull protocol, a randomized distributed algorithm for spreading a rumour in a graph $G$, works as follows. Independent Poisson clocks of rate 1 are associated with the vertices of $G$. Initially, one vertex of $G$ knows the rumour. Whenever the clock of a vertex $x$ rings, it calls a random neighbour $y$: if $x$ knows the rumour and $y$ does not, then $x$ tells $y$ the rumour…
▽ More
The asynchronous push&pull protocol, a randomized distributed algorithm for spreading a rumour in a graph $G$, works as follows. Independent Poisson clocks of rate 1 are associated with the vertices of $G$. Initially, one vertex of $G$ knows the rumour. Whenever the clock of a vertex $x$ rings, it calls a random neighbour $y$: if $x$ knows the rumour and $y$ does not, then $x$ tells $y$ the rumour (a push operation), and if $x$ does not know the rumour and $y$ knows it, $y$ tells $x$ the rumour (a pull operation). The average spread time of $G$ is the expected time it takes for all vertices to know the rumour, and the guaranteed spread time of $G$ is the smallest time $t$ such that with probability at least $1-1/n$, after time $t$ all vertices know the rumour. The synchronous variant of this protocol, in which each clock rings precisely at times $1,2,\dots$, has been studied extensively. We prove the following results for any $n$-vertex graph: In either version, the average spread time is at most linear even if only the pull operation is used, and the guaranteed spread time is within a logarithmic factor of the average spread time, so it is $O(n\log n)$. In the asynchronous version, both the average and guaranteed spread times are $Ω(\log n)$. We give examples of graphs illustrating that these bounds are best possible up to constant factors. We also prove theoretical relationships between the guaranteed spread times in the two versions. Firstly, in all graphs the guaranteed spread time in the asynchronous version is within an $O(\log n)$ factor of that in the synchronous version, and this is tight. Next, we find examples of graphs whose asynchronous spread times are logarithmic, but the synchronous versions are polynomially large. Finally, we show for any graph that the ratio of the synchronous spread time to the asynchronous spread time is $O(n^{2/3})$.
△ Less
Submitted 15 September, 2015; v1 submitted 4 November, 2014;
originally announced November 2014.
-
Randomized Rumor Spreading in Poorly Connected Small-World Networks
Authors:
Abbas Mehrabian,
Ali Pourmiri
Abstract:
Push-Pull is a well-studied round-robin rumor spreading protocol defined as follows: initially a node knows a rumor and wants to spread it to all nodes in a network quickly. In each round, every informed node sends the rumor to a random neighbor, and every uninformed node contacts a random neighbor and gets the rumor from her if she knows it. We analyze this protocol on random $k$-trees, a class o…
▽ More
Push-Pull is a well-studied round-robin rumor spreading protocol defined as follows: initially a node knows a rumor and wants to spread it to all nodes in a network quickly. In each round, every informed node sends the rumor to a random neighbor, and every uninformed node contacts a random neighbor and gets the rumor from her if she knows it. We analyze this protocol on random $k$-trees, a class of power law graphs, which are small-world and have large clustering coefficients, built as follows: initially we have a $k$-clique. In every step a new node is born, a random $k$-clique of the current graph is chosen, and the new node is joined to all nodes of the $k$-clique. When $k>1$ is fixed, we show that if initially a random node is aware of the rumor, then with probability $1-o(1)$ after $O\left((\log n)^{1+2/k}\cdot\log\log n\cdot f(n)\right)$ rounds the rumor propagates to $n-o(n)$ nodes, where $n$ is the number of nodes and $f(n)$ is any slowly growing function. Since these graphs have polynomially small conductance, vertex expansion $O(1/n)$ and constant treewidth, these results demonstrate that Push-Pull can be efficient even on poorly connected networks. On the negative side, we prove that with probability $1-o(1)$ the protocol needs at least $Ω\left(n^{(k-1)/(k^2+k-1)}/f^2(n)\right)$ rounds to inform all nodes. This exponential dichotomy between time required for informing almost all and all nodes is striking. Our main contribution is to present, for the first time, a natural class of random graphs in which such a phenomenon can be observed. Our technique for proving the upper bound carries over to a closely related class of graphs, random $k$-Apollonian networks, for which we prove an upper bound of $O\left((\log n)^{c_k}\cdot\log\log n\cdot f(n)\right)$ rounds for informing $n-o(n)$ nodes with probability $1-o(1)$ when $k>2$ is fixed. Here, $c_k=(k^2-3)/(k-1)^2<1 + 2/k$.
△ Less
Submitted 29 October, 2014;
originally announced October 2014.
-
Justifying the small-world phenomenon via random recursive trees
Authors:
Abbas Mehrabian
Abstract:
We present a new technique for proving logarithmic upper bounds for diameters of evolving random graph models, which is based on defining a coupling between random graphs and variants of random recursive trees. The advantage of the technique is three-fold: it is quite simple and provides short proofs, it is applicable to a broad variety of models including those incorporating preferential attachme…
▽ More
We present a new technique for proving logarithmic upper bounds for diameters of evolving random graph models, which is based on defining a coupling between random graphs and variants of random recursive trees. The advantage of the technique is three-fold: it is quite simple and provides short proofs, it is applicable to a broad variety of models including those incorporating preferential attachment, and it provides bounds with small constants. We illustrate this by proving, for the first time, logarithmic upper bounds for the diameters of the following well known models: the forest fire model, the copying model, the PageRank-based selection model, the Aiello-Chung-Lu models, the generalized linear preference model, directed scale-free graphs, the Cooper-Frieze model, and random unordered increasing $k$-trees. Our results shed light on why the small-world phenomenon is observed in so many real-world graphs.
△ Less
Submitted 23 October, 2014;
originally announced October 2014.
-
It's a Small World for Random Surfers
Authors:
Abbas Mehrabian,
Nick Wormald
Abstract:
We prove logarithmic upper bounds for the diameters of the random-surfer Webgraph model and the PageRank-based selection Webgraph model, confirming the small world phenomenon holds for them. In the special case when the generated graph is a tree, we provide close lower and upper bounds for the diameters of both models.
We prove logarithmic upper bounds for the diameters of the random-surfer Webgraph model and the PageRank-based selection Webgraph model, confirming the small world phenomenon holds for them. In the special case when the generated graph is a tree, we provide close lower and upper bounds for the diameters of both models.
△ Less
Submitted 28 April, 2014;
originally announced April 2014.
-
Longest paths in random Apollonian networks and largest $r$-ary subtrees of random $d$-ary recursive trees
Authors:
Andrea Collevecchio,
Abbas Mehrabian,
Nick Wormald
Abstract:
Let $r$ and $d$ be positive integers with $r<d$. Consider a random $d$-ary tree constructed as follows. Start with a single vertex, and in each time-step choose a uniformly random leaf and give it $d$ newly created offspring. Let ${\mathcal T}_t$ be the tree produced after $t$ steps. We show that there exists a fixed $δ<1$ depending on $d$ and $r$ such that almost surely for all large $t$, every…
▽ More
Let $r$ and $d$ be positive integers with $r<d$. Consider a random $d$-ary tree constructed as follows. Start with a single vertex, and in each time-step choose a uniformly random leaf and give it $d$ newly created offspring. Let ${\mathcal T}_t$ be the tree produced after $t$ steps. We show that there exists a fixed $δ<1$ depending on $d$ and $r$ such that almost surely for all large $t$, every $r$-ary subtree of ${\mathcal T}_t$ has less than $t^δ$ vertices.
The proof involves analysis that also yields a related result. Consider the following iterative construction of a random planar triangulation. Start with a triangle embedded in the plane. In each step, choose a bounded face uniformly at random, add a vertex inside that face and join it to the vertices of the face. In this way, one face is destroyed and three new faces are created. After $t$ steps, we obtain a random triangulated plane graph with $t+3$ vertices, which is called a random Apollonian network. We prove that there exists a fixed $δ<1$, such that eventually every path in this graph has length less than $t^δ$, which verifies a conjecture of Cooper and Frieze.
△ Less
Submitted 11 April, 2014; v1 submitted 9 April, 2014;
originally announced April 2014.
-
On the Longest Paths and the Diameter in Random Apollonian Networks
Authors:
Ehsan Ebrahimzadeh,
Linda Farczadi,
Pu Gao,
Abbas Mehrabian,
Cristiane M. Sato,
Nick Wormald,
Jonathan Zung
Abstract:
We consider the following iterative construction of a random planar triangulation. Start with a triangle embedded in the plane. In each step, choose a bounded face uniformly at random, add a vertex inside that face and join it to the vertices of the face. After n-3 steps, we obtain a random triangulated plane graph with n vertices, which is called a Random Apollonian Network (RAN). We show that as…
▽ More
We consider the following iterative construction of a random planar triangulation. Start with a triangle embedded in the plane. In each step, choose a bounded face uniformly at random, add a vertex inside that face and join it to the vertices of the face. After n-3 steps, we obtain a random triangulated plane graph with n vertices, which is called a Random Apollonian Network (RAN). We show that asymptotically almost surely (a.a.s.) every path in a RAN has length o(n), refuting a conjecture of Frieze and Tsourakakis. We also show that a RAN always has a path of length (2n-5)^{log 2/log 3}, and that the expected length of its longest path is Omega(n^0.88). Finally, we prove that a.a.s. the diameter of a RAN is asymptotic to c log n, where c \approx 1.668 is the solution of an explicit equation.
△ Less
Submitted 21 March, 2013;
originally announced March 2013.
-
On the Maximum Density of Graphs with Good Edge-Labellings
Authors:
Abbas Mehrabian,
Dieter Mitsche,
Paweł Prałat
Abstract:
A good edge-labelling of a simple, finite graph is a labelling of its edges with real numbers such that, for every ordered pair of vertices (u,v), there is at most one nondecreasing path from u to v. In this paper we prove that any graph on n vertices that admits a good edge-labelling has at most n log_2(n)/2 edges, and that this bound is tight for infinitely many values of n. Thus we significantl…
▽ More
A good edge-labelling of a simple, finite graph is a labelling of its edges with real numbers such that, for every ordered pair of vertices (u,v), there is at most one nondecreasing path from u to v. In this paper we prove that any graph on n vertices that admits a good edge-labelling has at most n log_2(n)/2 edges, and that this bound is tight for infinitely many values of n. Thus we significantly improve on the previously best known bounds. The main tool of the proof is a combinatorial lemma which might be of independent interest. For every n we also construct an n-vertex graph that admits a good edge-labelling and has n log_2(n)/2 - O(n) edges.
△ Less
Submitted 12 November, 2012;
originally announced November 2012.
-
On the Stretch Factor of Randomly Embedded Random Graphs
Authors:
Abbas Mehrabian,
Nick Wormald
Abstract:
We consider a random graph G(n,p) whose vertex set V has been randomly embedded in the unit square and whose edges are given weight equal to the geometric distance between their end vertices. Then each pair {u,v} of vertices have a distance in the weighted graph, and a Euclidean distance. The stretch factor of the embedded graph is defined as the maximum ratio of these two distances, over all u,v…
▽ More
We consider a random graph G(n,p) whose vertex set V has been randomly embedded in the unit square and whose edges are given weight equal to the geometric distance between their end vertices. Then each pair {u,v} of vertices have a distance in the weighted graph, and a Euclidean distance. The stretch factor of the embedded graph is defined as the maximum ratio of these two distances, over all u,v in V. We give upper and lower bounds on the stretch factor (holding asymptotically almost surely), and show that for p not too close to 0 or 1, these bounds are best possible in a certain sense. Our results imply that the stretch factor is bounded with probability tending to 1 if and only if n(1-p) tends to 0, answering a question of O'Rourke.
△ Less
Submitted 28 May, 2012;
originally announced May 2012.
-
On a Bounded Budget Network Creation Game
Authors:
Shayan Ehsani,
Saber Shokat Fadaee,
MohammadAmin Fazli,
Abbas Mehrabian,
Sina Sadeghian Sadeghabad,
MohammadAli Safari,
Morteza Saghafian
Abstract:
We consider a network creation game in which each player (vertex) has a fixed budget to establish links to other players. In our model, each link has unit price and each agent tries to minimize its cost, which is either its local diameter or its total distance to other players in the (undirected) underlying graph of the created network. Two versions of the game are studied: in the MAX version, the…
▽ More
We consider a network creation game in which each player (vertex) has a fixed budget to establish links to other players. In our model, each link has unit price and each agent tries to minimize its cost, which is either its local diameter or its total distance to other players in the (undirected) underlying graph of the created network. Two versions of the game are studied: in the MAX version, the cost incurred to a vertex is the maximum distance between the vertex and other vertices, and in the SUM version, the cost incurred to a vertex is the sum of distances between the vertex and other vertices. We prove that in both versions pure Nash equilibria exist, but the problem of finding the best response of a vertex is NP-hard. We take the social cost of the created network to be its diameter, and next we study the maximum possible diameter of an equilibrium graph with n vertices in various cases. When the sum of players' budgets is n-1, the equilibrium graphs are always trees, and we prove that their maximum diameter is Theta(n) and Theta(log n) in MAX and SUM versions, respectively. When each vertex has unit budget (i.e. can establish link to just one vertex), the diameter of any equilibrium graph in either version is Theta(1). We give examples of equilibrium graphs in the MAX version, such that all vertices have positive budgets and yet the diameter is Omega(sqrt(log n)). This interesting (and perhaps counter-intuitive) result shows that increasing the budgets may increase the diameter of equilibrium graphs and hence deteriorate the network structure. Then we prove that every equilibrium graph in the SUM version has diameter 2^O(sqrt(log n)). Finally, we show that if the budget of each player is at least k, then every equilibrium graph in the SUM version is k-connected or has diameter smaller than 4.
△ Less
Submitted 10 June, 2012; v1 submitted 2 November, 2011;
originally announced November 2011.
-
On the density of nearly regular graphs with a good edge-labelling
Authors:
Abbas Mehrabian
Abstract:
A good edge-labelling of a simple graph is a labelling of its edges with real numbers such that, for any ordered pair of vertices (u,v), there is at most one nondecreasing path from u to v. Say a graph is good if it admits a good edge-labelling, and is bad otherwise. Our main result is that any good n-vertex graph whose maximum degree is within a constant factor of its average degree (in particula…
▽ More
A good edge-labelling of a simple graph is a labelling of its edges with real numbers such that, for any ordered pair of vertices (u,v), there is at most one nondecreasing path from u to v. Say a graph is good if it admits a good edge-labelling, and is bad otherwise. Our main result is that any good n-vertex graph whose maximum degree is within a constant factor of its average degree (in particular, any good regular graph) has at most n^{1+o(1)} edges. As a corollary, we show that there are bad graphs with arbitrarily large girth, answering a question of Bode, Farzad and Theis. We also prove that for any Delta, there is a g such that any graph with maximum degree at most Delta and girth at least g is good.
△ Less
Submitted 11 October, 2011;
originally announced October 2011.
-
Cops and Robber Game with a Fast Robber on Expander Graphs and Random Graphs
Authors:
Abbas Mehrabian
Abstract:
We consider a variant of the Cops and Robber game, in which the robber has unbounded speed, i.e. can take any path from her vertex in her turn, but she is not allowed to pass through a vertex occupied by a cop. Let c_{infty}(G) denote the number of cops needed to capture the robber in a graph G in this variant. We characterize graphs G with c_{infty}(G)=1, and give an O(|V(G)|^2) algorithm for the…
▽ More
We consider a variant of the Cops and Robber game, in which the robber has unbounded speed, i.e. can take any path from her vertex in her turn, but she is not allowed to pass through a vertex occupied by a cop. Let c_{infty}(G) denote the number of cops needed to capture the robber in a graph G in this variant. We characterize graphs G with c_{infty}(G)=1, and give an O(|V(G)|^2) algorithm for their detection. We prove a lower bound for c_{infty} of expander graphs, and use it to prove three things. The first is that if np > 4.2 log n then the random graph G = G(n,p) asymptotically almost surely has e1/p < c_{infty}(G) < e2 log (np)/p, for suitable constants e1 and e2. The second is that a fixed-degree random regular graph G with n vertices asymptotically almost surely has c_{infty}(G) = Theta(n). The third is that if G is a Cartesian product of m paths, then n / 4km^2 < c_{infty}(G) < n / k, where n=|V(G)| and k is the number of vertices of the longest path.
△ Less
Submitted 13 May, 2011;
originally announced May 2011.
-
The capture time of grids
Authors:
Abbas Mehrabian
Abstract:
We consider the game of Cops and Robber played on the Cartesian product of two trees. Assuming the players play perfectly, it is shown that if there are two cops in the game, then the length of the game (known as the 2-capture time of the graph) is equal to half the diameter of the graph. In particular, the 2-capture time of the m x n grid is proved to be floor ((m+n-2)/2).
We consider the game of Cops and Robber played on the Cartesian product of two trees. Assuming the players play perfectly, it is shown that if there are two cops in the game, then the length of the game (known as the 2-capture time of the graph) is equal to half the diameter of the graph. In particular, the 2-capture time of the m x n grid is proved to be floor ((m+n-2)/2).
△ Less
Submitted 31 October, 2010; v1 submitted 25 August, 2010;
originally announced August 2010.
-
Cops and Robber Game with a Fast Robber on Interval, Chordal, and Planar Graphs
Authors:
Abbas Mehrabian
Abstract:
We consider a variant of the Cops and Robber game, introduced by Fomin, Golovach, Kratochvil, in which the robber has unbounded speed, i.e. can take any path from her vertex in her turn, but she is not allowed to pass through a vertex occupied by a cop. We study this game on interval graphs, chordal graphs, planar graphs, and hypercube graphs. Let c_{\infty}(G) denote the number of cops needed to…
▽ More
We consider a variant of the Cops and Robber game, introduced by Fomin, Golovach, Kratochvil, in which the robber has unbounded speed, i.e. can take any path from her vertex in her turn, but she is not allowed to pass through a vertex occupied by a cop. We study this game on interval graphs, chordal graphs, planar graphs, and hypercube graphs. Let c_{\infty}(G) denote the number of cops needed to capture the robber in graph G in this variant. We show that if G is an interval graph, then c_{\infty}(G) = O(sqrt(|V(G)|)), and we give a polynomial-time 3-approximation algorithm for finding c_{\infty}(G) in interval graphs. We prove that for every n there exists an n-vertex chordal graph G with c_{\infty}(G) = Omega(n / \log n). Let tw(G) and Delta(G) denote the treewidth and the maximum degree of G, respectively. We prove that for every G, tw(G) + 1 \leq (Delta(G) + 1) c_{\infty}(G). Using this lower bound for c_{\infty}(G), we show two things. The first is that if G is a planar graph (or more generally, if G does not have a fixed apex graph as a minor), then c_{\infty}(G) = Theta(tw(G)). This immediately leads to an O(1)-approximation algorithm for computing c_{\infty} for planar graphs. The second is that if G is the m-hypercube graph, then there exist constants eta1, eta2>0 such that (eta1) 2^m / (m sqrt(m)) \leq c_{\infty}(G) \leq (eta2) 2^m / m.
△ Less
Submitted 16 April, 2011; v1 submitted 25 August, 2010;
originally announced August 2010.
-
Lower Bounds for the Cop Number When the Robber is Fast
Authors:
Abbas Mehrabian
Abstract:
We consider a variant of the Cops and Robbers game where the robber can move t edges at a time, and show that in this variant, the cop number of a d-regular graph with girth larger than 2t+2 is Omega(d^t). By the known upper bounds on the order of cages, this implies that the cop number of a connected n-vertex graph can be as large as Omega(n^{2/3}) if t>1, and Omega(n^{4/5}) if t>3. This improves…
▽ More
We consider a variant of the Cops and Robbers game where the robber can move t edges at a time, and show that in this variant, the cop number of a d-regular graph with girth larger than 2t+2 is Omega(d^t). By the known upper bounds on the order of cages, this implies that the cop number of a connected n-vertex graph can be as large as Omega(n^{2/3}) if t>1, and Omega(n^{4/5}) if t>3. This improves the Omega(n^{(t-3)/(t-2)}) lower bound of Frieze, Krivelevich, and Loh (Variations on Cops and Robbers, J. Graph Theory, 2011) when 1<t<7. We also conjecture a general upper bound O(n^{t/t+1}) for the cop number in this variant, generalizing Meyniel's conjecture.
△ Less
Submitted 10 May, 2011; v1 submitted 10 July, 2010;
originally announced July 2010.