-
Data Science In Olfaction
Authors:
Vivek Agarwal,
Joshua Harvey,
Dmitry Rinberg,
Vasant Dhar
Abstract:
Advances in neural sensing technology are making it possible to observe the olfactory process in great detail. In this paper, we conceptualize smell from a Data Science and AI perspective that relates the properties of odorants to how they are sensed and analyzed in the olfactory system from the nose to the brain. Drawing distinctions to color vision, we argue that smell presents unique measurement challenges, including the complexity of stimuli, the high dimensionality of the sensory apparatus, as well as what constitutes ground truth. In the face of these challenges, we argue for the centrality of odorant-receptor interactions in developing a theory of olfaction. Such a theory is likely to find widespread industrial applications, and enhance our understanding of smell, and in the longer-term, how it relates to other senses and language. As an initial use case of the data, we present results using machine learning-based classification of neural responses to odors as they are recorded in the mouse olfactory bulb with calcium imaging.
Submitted 8 April, 2024;
originally announced April 2024.
-
An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection
Authors:
Abdul Aziz,
Nhat Pham,
Neel Vora,
Cody Reynolds,
Jaime Lehnen,
Pooja Venkatesh,
Zhuoran Yao,
Jay Harvey,
Tam Vu,
Kan Ding,
Phuc Nguyen
Abstract:
Epilepsy is one of the most common neurological diseases globally, affecting around 50 million people worldwide. Fortunately, up to 70 percent of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scalp-based EEG test, despite being the gold standard for diagnosing epilepsy, is costly, necessitates hospitalization, demands skilled professionals for operation, and is discomforting for users. In this paper, we propose EarSD, a novel lightweight, unobtrusive, and socially acceptable ear-worn system to detect epileptic seizure onsets by measuring the physiological signals from behind the user's ears. EarSD includes an integrated custom-built sensing, computing, and communication PCB to collect and amplify the signals of interest, remove the noises caused by motion artifacts and environmental impacts, and stream the data wirelessly to the computer or mobile phone nearby, where data are uploaded to the host computer for further processing. We conducted both in-lab and in-hospital experiments with epileptic seizure patients who were hospitalized for seizure studies. The preliminary results confirm that EarSD can detect seizures with up to 95.3 percent accuracy by just using classical machine learning algorithms.
Submitted 1 January, 2024;
originally announced January 2024.
-
Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression
Authors:
Neel R Vora,
Amir Hajighasemi,
Cody T. Reynolds,
Amirmohammad Radmehr,
Mohamed Mohamed,
Jillur Rahman Saurav,
Abdul Aziz,
Jai Prakash Veerla,
Mohammad S Nasr,
Hayden Lotspeich,
Partha Sai Guttikonda,
Thuong Pham,
Aarti Darji,
Parisa Boodaghi Malidarreh,
Helen H Shang,
Jay Harvey,
Kan Ding,
Phuc Nguyen,
Jacob M Luber
Abstract:
Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in clinical diagnosis, monitoring, and treatment of important brain disorder diseases.
However, the real-time transmission of a large corpus of physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monitoring wearables.
This paper presents a novel deep-learning framework employing a variational autoencoder (VAE) for physiological signal compression to reduce wearables' computational complexity and energy consumption.
Our approach achieves an impressive compression ratio of 1:293 specifically for spectrogram data, surpassing state-of-the-art compression techniques such as JPEG2000, H.264, Discrete Cosine Transform (DCT), and Huffman Encoding, which do not excel in handling physiological signals.
We validate the efficacy of the compression algorithm using physiological signals collected from real patients in the hospital and deploy the solution on commonly used embedded AI chips (i.e., ARM Cortex V8 and Jetson Nano). The proposed framework achieves a 91% seizure detection accuracy using XGBoost, confirming the approach's reliability, practicality, and scalability.
Submitted 4 January, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Do AI models produce better weather forecasts than physics-based models? A quantitative evaluation case study of Storm Ciarán
Authors:
Andrew J. Charlton-Perez,
Helen F. Dacre,
Simon Driscoll,
Suzanne L. Gray,
Ben Harvey,
Natalie J. Harvey,
Kieran M. R. Hunt,
Robert W. Lee,
Ranjini Swaminathan,
Remy Vandaele,
Ambrogio Volonté
Abstract:
There has been huge recent interest in the potential of making operational weather forecasts using machine learning techniques. As they become a part of the weather forecasting toolbox, there is a pressing need to understand how well current machine learning models can simulate high-impact weather events. We compare forecasts of Storm Ciarán, a European windstorm that caused sixteen deaths and extensive damage in Northern Europe, made by machine learning and numerical weather prediction models. The four machine learning models considered (FourCastNet, Pangu-Weather, GraphCast and FourCastNet-v2) produce forecasts that accurately capture the synoptic-scale structure of the cyclone including the position of the cloud head, shape of the warm sector and location of warm conveyor belt jet, and the large-scale dynamical drivers important for the rapid storm development such as the position of the storm relative to the upper-level jet exit. However, their ability to resolve the more detailed structures important for issuing weather warnings is more mixed. All of the machine learning models underestimate the peak amplitude of winds associated with the storm, only some machine learning models resolve the warm core seclusion and none of the machine learning models capture the sharp bent-back warm frontal gradient. Our study shows there is a great deal about the performance and properties of machine learning weather forecasts that can be derived from case studies of high-impact weather events such as Storm Ciarán.
Submitted 19 February, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
-
Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization
Authors:
Alexander Wikner,
Joseph Harvey,
Michelle Girvan,
Brian R. Hunt,
Andrew Pomerance,
Thomas Antonsen,
Edward Ott
Abstract:
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of unknown chaotic dynamical systems. Short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics ("climate") can be produced by employing a feedback loop, whereby the model is trained to predict forward one time step, then the model output is used as input for multiple time steps. In the absence of mitigating techniques, however, this technique can result in artificially rapid error growth. In this article, we systematically examine the technique of adding noise to the ML model input during training to promote stability and improve prediction accuracy. Furthermore, we introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training. Our case study uses reservoir computing, a machine-learning method using recurrent neural networks, to predict the spatiotemporal chaotic Kuramoto-Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to the true system, while reservoir computers trained without regularization are unstable. Compared with other regularization techniques that yield stability in some cases, we find that both short-term and climate predictions from reservoir computers trained with noise or with LMNT are substantially more accurate. Finally, we show that the deterministic aspect of our LMNT regularization facilitates fast hyperparameter tuning when compared to training with noise.
Submitted 12 December, 2022; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Continuous Prediction with Experts' Advice
Authors:
Victor Sanches Portella,
Christopher Liaw,
Nicholas J. A. Harvey
Abstract:
Prediction with experts' advice is one of the most fundamental problems in online learning and captures many of its technical challenges. A recent line of work has looked at online learning through the lens of differential equations and continuous-time analysis. This viewpoint has yielded optimal results for several problems in online learning.
In this paper, we employ continuous-time stochastic calculus in order to study the discrete-time experts' problem. We use these tools to design a continuous-time, parameter-free algorithm with improved guarantees for the quantile regret. We then develop an analogous discrete-time algorithm with a very similar analysis and identical quantile regret bounds. Finally, we design an anytime continuous-time algorithm with regret matching the optimal fixed-time rate when the gains are independent Brownian Motions; in many settings, this is the most difficult case. This gives some evidence that, even with adversarial gains, the optimal anytime and fixed-time regrets may coincide.
Submitted 30 September, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Efficient and Optimal Fixed-Time Regret with Two Experts
Authors:
Laura Greenstreet,
Nicholas J. A. Harvey,
Victor Sanches Portella
Abstract:
Prediction with expert advice is a foundational problem in online learning. In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand. Moreover, this is asymptotically optimal when both $T$ and $n$ grow to infinity. However, when the number of experts $n$ is small/fixed, algorithms with better regret guarantees exist. Cover showed in 1967 a dynamic programming algorithm for the two-experts problem restricted to $\{0,1\}$ costs that suffers at most $\sqrt{T/2π} + O(1)$ regret with $O(T^2)$ pre-processing time. In this work, we propose an optimal algorithm for prediction with two experts' advice that works even for costs in $[0,1]$ and with $O(1)$ processing time per turn. Our algorithm builds upon recent work on the experts problem based on techniques and tools from stochastic calculus.
Submitted 14 March, 2022;
originally announced March 2022.
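As an illustrative aside (not taken from the paper), the classical Multiplicative Weights Update baseline mentioned in the abstract can be sketched as follows. The function name, the random cost matrix, and the learning rate $\sqrt{8\ln n/T}$ are assumptions chosen for this sketch; that learning rate is the standard textbook choice for which the $\sqrt{(T/2)\ln n}$ bound holds for costs in $[0,1]$.

```python
import math
import random


def multiplicative_weights(costs, eta):
    """Run Multiplicative Weights (Hedge) on a T x n matrix of costs in [0, 1].

    Returns (expected cost of the algorithm, total cost of the best expert).
    """
    n = len(costs[0])
    weights = [1.0] * n
    totals = [0.0] * n
    algorithm_cost = 0.0
    for round_costs in costs:
        total_weight = sum(weights)
        probs = [w / total_weight for w in weights]
        # Expected cost of sampling an expert from the current distribution.
        algorithm_cost += sum(p * c for p, c in zip(probs, round_costs))
        for i, c in enumerate(round_costs):
            weights[i] *= math.exp(-eta * c)  # exponential down-weighting
            totals[i] += c
    return algorithm_cost, min(totals)


# Hypothetical instance: T rounds, n experts, i.i.d. uniform costs.
T, n = 1000, 4
rng = random.Random(0)
costs = [[rng.random() for _ in range(n)] for _ in range(T)]
eta = math.sqrt(8 * math.log(n) / T)  # requires T to be known beforehand
alg, best = multiplicative_weights(costs, eta)
regret = alg - best
bound = math.sqrt((T / 2) * math.log(n))
```

The bound is a worst-case guarantee, so on benign inputs such as this random instance the observed regret is typically well below it.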
-
Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses
Authors:
Yihan Zhou,
Victor S. Portella,
Mark Schmidt,
Nicholas J. A. Harvey
Abstract:
In online convex optimization (OCO), Lipschitz continuity of the functions is commonly assumed in order to obtain sublinear regret. Moreover, many algorithms have only logarithmic regret when these functions are also strongly convex. Recently, researchers from convex optimization proposed the notions of "relative Lipschitz continuity" and "relative strong convexity". Both of the notions are generalizations of their classical counterparts. It has been shown that subgradient methods in the relative setting have performance analogous to their performance in the classical setting.
In this work, we consider OCO for relative Lipschitz and relative strongly convex functions. We extend the known regret bounds for classical OCO algorithms to the relative setting. Specifically, we show regret bounds for follow-the-regularized-leader algorithms and a variant of online mirror descent. Due to the generality of these methods, these results yield regret bounds for a wide variety of OCO algorithms. Furthermore, we extend the results to algorithms with extra regularization such as regularized dual averaging.
Submitted 28 December, 2020; v1 submitted 22 October, 2020;
originally announced October 2020.
-
Online mirror descent and dual averaging: keeping pace in the dynamic case
Authors:
Huang Fang,
Nicholas J. A. Harvey,
Victor S. Portella,
Michael P. Friedlander
Abstract:
Online mirror descent (OMD) and dual averaging (DA) -- two fundamental algorithms for online convex optimization -- are known to have very similar (and sometimes identical) performance guarantees when used with a fixed learning rate. Under dynamic learning rates, however, OMD is provably inferior to DA and suffers a linear regret, even in common settings such as prediction with expert advice. We modify the OMD algorithm through a simple technique that we call stabilization. We give essentially the same abstract regret bound for OMD with stabilization and for DA by modifying the classical OMD convergence analysis in a careful and modular way that allows for straightforward and flexible proofs. Simple corollaries of these bounds show that OMD with stabilization and DA enjoy the same performance guarantees in many applications -- even under dynamic learning rates. We also shed light on the similarities between OMD and DA and show simple conditions under which stabilized-OMD and DA generate the same iterates.
Submitted 3 September, 2021; v1 submitted 3 June, 2020;
originally announced June 2020.
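As an illustrative aside (not the paper's stabilization technique), the fixed-learning-rate case mentioned in the abstract can be checked numerically in the classical entropic setup on the probability simplex, where OMD and DA are known to coincide. The function names and the gradient sequence below are hypothetical, chosen only for this sketch.

```python
import math


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def omd_entropic(gradients, eta):
    """Online mirror descent with the entropy mirror map (exponentiated gradient)."""
    n = len(gradients[0])
    x = [1.0 / n] * n
    iterates = [x]
    for g in gradients:
        # Multiplicative update from the previous iterate, then renormalize.
        x = normalize([xi * math.exp(-eta * gi) for xi, gi in zip(x, g)])
        iterates.append(x)
    return iterates


def da_entropic(gradients, eta):
    """Dual averaging with the entropy regularizer and a fixed learning rate."""
    n = len(gradients[0])
    cumulative = [0.0] * n
    iterates = [[1.0 / n] * n]
    for g in gradients:
        cumulative = [c + gi for c, gi in zip(cumulative, g)]
        # Each iterate depends only on the running sum of gradients.
        iterates.append(normalize([math.exp(-eta * c) for c in cumulative]))
    return iterates


gradients = [[0.2, 0.9, 0.5], [0.7, 0.1, 0.4], [0.3, 0.3, 0.8]]
omd = omd_entropic(gradients, eta=0.5)
da = da_entropic(gradients, eta=0.5)
max_diff = max(abs(a - b) for xs, ys in zip(omd, da) for a, b in zip(xs, ys))
```

With a fixed `eta` the two iterate sequences agree up to floating-point rounding. Once the learning rate changes from round to round, the OMD update no longer telescopes into the DA form, which is the regime the abstract addresses.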
-
Introducing PyCross: PyCloudy Rendering Of Shape Software for pseudo 3D ionisation modelling of nebulae
Authors:
K. Fitzgerald,
E. J Harvey,
N. Keaveney,
M. Redman
Abstract:
Research into the processes of photoionised nebulae plays a significant part in our understanding of stellar evolution. It is extremely difficult to visually represent or model ionised nebulae, requiring astronomers to employ sophisticated modelling code to derive temperature, density and chemical composition. Existing codes often require steep learning curves and produce models derived from mathematical functions. In this article we introduce PyCross: PyCloudy Rendering Of Shape Software. This is a pseudo 3D modelling application that generates photoionisation models of optically thin nebulae, created using the Shape software. Currently PyCross has been used for novae and planetary nebulae, and it can be extended to Active Galactic Nuclei or any other type of photoionised axisymmetric nebulae. Functionality, an operational overview, and a scientific pipeline are described with scenarios where PyCross has been adopted for novae (V5668 Sagittarii (2015) & V4362 Sagittarii (1994)) and a planetary nebula (LoTr1). Unlike the aforementioned photoionisation codes, this application does not require any coding experience, nor the need to derive complex mathematical models, instead utilising select features from Cloudy/PyCloudy and Shape. The software was developed using a formal software development lifecycle, is written in Python, and works without the need to install any development environments or additional Python packages. This application, Shape models and PyCross archive examples are freely available to students, academics and the research community on GitHub for download (https://github.com/karolfitzgerald/PyCross_OSX_App).
Submitted 6 May, 2020;
originally announced May 2020.
-
Optimal anytime regret with two experts
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Edwin Perkins,
Sikander Randhawa
Abstract:
We consider the classical problem of prediction with expert advice. In the fixed-time setting, where the time horizon is known in advance, algorithms that achieve the optimal regret are known when there are two, three, or four experts or when the number of experts is large. Much less is known about the problem in the anytime setting, where the time horizon is not known in advance. No minimax optimal algorithm was previously known in the anytime setting, regardless of the number of experts. Even for the case of two experts, Luo and Schapire have left open the problem of determining the optimal algorithm.
We design the first minimax optimal algorithm for minimizing regret in the anytime setting. We consider the case of two experts, and prove that the optimal regret is $γ\sqrt{t} / 2$ at all time steps $t$, where $γ$ is a natural constant that arose 35 years ago in studying fundamental properties of Brownian motion. The algorithm is designed by considering a continuous analogue of the regret problem, which is solved using ideas from stochastic calculus.
Submitted 26 August, 2021; v1 submitted 20 February, 2020;
originally announced February 2020.
-
Estimating the reach of a manifold via its convexity defect function
Authors:
Clément Berenfeld,
John Harvey,
Marc Hoffmann,
Krishnan Shankar
Abstract:
The reach of a submanifold is a crucial regularity parameter for manifold learning and geometric inference from point clouds. This paper relates the reach of a submanifold to its convexity defect function. Using the stability properties of convexity defect functions, along with some new bounds and the recent submanifold estimator of Aamari and Levrard [Ann. Statist. 47 177-204 (2019)], an estimator for the reach is given. A uniform expected loss bound over a C^k model is found. Lower bounds for the minimax rate for estimating the reach over these models are also provided. The estimator almost achieves these rates in the C^3 and C^4 cases, with a gap given by a logarithmic factor.
Submitted 26 March, 2021; v1 submitted 22 January, 2020;
originally announced January 2020.
-
Simple and optimal high-probability bounds for strongly-convex stochastic gradient descent
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Sikander Randhawa
Abstract:
We consider stochastic gradient descent algorithms for minimizing a non-smooth, strongly-convex function. Several forms of this algorithm, including suffix averaging, are known to achieve the optimal $O(1/T)$ convergence rate in expectation. We consider a simple, non-uniform averaging strategy of Lacoste-Julien et al. (2011) and prove that it achieves the optimal $O(1/T)$ convergence rate with high probability. Our proof uses a recently developed generalization of Freedman's inequality. Finally, we compare several of these algorithms experimentally and show that this non-uniform averaging strategy outperforms many standard techniques, and with smaller variance.
Submitted 2 September, 2019;
originally announced September 2019.
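As an illustrative aside (not the paper's experiments), a non-uniform averaging scheme of the kind the abstract refers to can be sketched on a one-dimensional toy problem. Everything below is an assumption made for this sketch: the objective, the Gaussian noise model, the step size $2/(\lambda(t+1))$, and averaging weights proportional to $t$. The original indexing in Lacoste-Julien et al. may differ by a shift.

```python
import random

LAM = 1.0  # strong-convexity parameter of the toy objective (assumed)


def f(x):
    # Non-smooth, LAM-strongly convex; minimized at x = 1 when LAM = 1.
    return 0.5 * LAM * x * x + abs(x - 1.0)


def noisy_subgradient(x, rng):
    sign = 1.0 if x > 1.0 else -1.0 if x < 1.0 else 0.0
    # Exact subgradient plus zero-mean Gaussian noise (hypothetical oracle).
    return LAM * x + sign + rng.gauss(0.0, 1.0)


def sgd_weighted_average(steps, seed=0):
    rng = random.Random(seed)
    x = 0.0
    weighted_sum = 0.0
    weight_total = 0.0
    for t in range(1, steps + 1):
        x -= 2.0 / (LAM * (t + 1)) * noisy_subgradient(x, rng)
        # Later iterates receive linearly larger weight.
        weighted_sum += t * x
        weight_total += t
    return weighted_sum / weight_total


x_bar = sgd_weighted_average(5000)
gap = f(x_bar) - f(1.0)  # suboptimality of the weighted average
```

The weighted average can be maintained with two running sums, so unlike suffix averaging it needs neither the horizon in advance nor a buffer of past iterates.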
-
Overt visual attention on rendered 3D objects
Authors:
Oleksii Sidorov,
Joshua S. Harvey,
Hannah E. Smithson,
Jon Y. Hardeberg
Abstract:
This work covers multiple aspects of overt visual attention on 3D renders: measurement, projection, visualization, and application to studying the influence of material appearance on looking behaviour. In the scope of this work, we ran an eye-tracking experiment in which the observers were presented with animations of rotating 3D objects. The objects were rendered to simulate different metallic appearances, particularly smooth (glossy), rough (matte), and coated gold. The eye-tracking results illustrate how material appearance itself influences the observer's attention, while all the other parameters remain unchanged. In order to make visualization of the attention maps more natural and also make the analysis more accurate, we develop a novel technique of projection of gaze fixations on the 3D surface of the figure itself, instead of the conventional 2D plane of the screen. The proposed methodology will be useful for further studies of attention and saliency in the computer graphics domain.
Submitted 24 May, 2019;
originally announced May 2019.
-
Lifelong Testing of Smart Autonomous Systems by Shepherding a Swarm of Watchdog Artificial Intelligence Agents
Authors:
Hussein Abbass,
John Harvey,
Kate Yaxley
Abstract:
Artificial Intelligence (AI) technologies could be broadly categorised into Analytics and Autonomy. Analytics focuses on algorithms offering perception, comprehension, and projection of knowledge gleaned from sensorial data. Autonomy revolves around decision making, and influencing and shaping the environment through action production. A smart autonomous system (SAS) combines analytics and autonomy to understand, learn, decide and act autonomously. To be useful, SAS must be trusted and that requires testing. Lifelong learning of a SAS compounds the testing process. In the remote chance that it is possible to fully test and certify the system pre-release, which is theoretically an undecidable problem, it is near impossible to predict the future behaviours that these systems, alone or collectively, will exhibit. While it may be feasible to severely restrict such systems' learning abilities to limit the potential unpredictability of their behaviours, an undesirable consequence may be severely limiting their utility. In this paper, we propose the architecture for a watchdog AI (WAI) agent dedicated to lifelong functional testing of SAS. We further propose system specifications including a level of abstraction whereby humans shepherd a swarm of WAI agents to oversee an ecosystem made of humans and SAS. The discussion extends to the challenges, pros, and cons of the proposed concept.
Submitted 21 December, 2018;
originally announced December 2018.
-
Tight Analyses for Non-Smooth Stochastic Gradient Descent
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Yaniv Plan,
Sikander Randhawa
Abstract:
Consider the problem of minimizing functions that are Lipschitz and strongly convex, but not necessarily differentiable. We prove that after $T$ steps of stochastic gradient descent, the error of the final iterate is $O(\log(T)/T)$ with high probability. We also construct a function from this class for which the error of the final iterate of deterministic gradient descent is $Ω(\log(T)/T)$. This shows that the upper bound is tight and that, in this setting, the last iterate of stochastic gradient descent has the same general error rate (with high probability) as deterministic gradient descent. This resolves both open questions posed by Shamir (2012).
An intermediate step of our analysis proves that the suffix averaging method achieves error $O(1/T)$ with high probability, which is optimal (for any first-order optimization method). This improves results of Rakhlin (2012) and Hazan and Kale (2014), both of which achieved error $O(1/T)$, but only in expectation, and achieved a high probability error bound of $O(\log \log(T)/T)$, which is suboptimal.
We prove analogous results for functions that are Lipschitz and convex, but not necessarily strongly convex or differentiable. After $T$ steps of stochastic gradient descent, the error of the final iterate is $O(\log(T)/\sqrt{T})$ with high probability, and there exists a function for which the error of the final iterate of deterministic gradient descent is $Ω(\log(T)/\sqrt{T})$.
Submitted 12 December, 2018;
originally announced December 2018.
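As an illustrative aside (not the paper's construction), the difference between the final iterate and the suffix average discussed in the abstract can be explored on a toy problem. The objective, the Gaussian noise model, the step size $1/t$, and the choice of averaging the last half of the iterates are all assumptions made for this sketch.

```python
import random


def f(x):
    # Toy objective: 1-strongly convex, non-differentiable at its minimizer x = 0.
    return 0.5 * x * x + abs(x)


def noisy_subgradient(x, rng):
    sign = 1.0 if x > 0 else -1.0 if x < 0 else 0.0
    # Exact subgradient plus zero-mean Gaussian noise (hypothetical oracle).
    return x + sign + rng.gauss(0.0, 1.0)


def sgd_last_and_suffix(steps, seed=0):
    rng = random.Random(seed)
    x = 1.0
    iterates = []
    for t in range(1, steps + 1):
        x -= (1.0 / t) * noisy_subgradient(x, rng)
        iterates.append(x)
    # Suffix averaging: mean of the last half of the iterates.
    suffix = iterates[steps // 2:]
    return x, sum(suffix) / len(suffix)


last_x, suffix_x = sgd_last_and_suffix(4000)
# The minimum value is f(0) = 0, so f itself is the suboptimality gap.
last_gap, suffix_gap = f(last_x), f(suffix_x)
```

A single run on one toy function does not exhibit the worst-case $\log(T)/T$ behaviour of the final iterate; the sketch only shows how the two quantities compared in the abstract are computed.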
-
Machine Learning in High Energy Physics Community White Paper
Authors:
Kim Albertsson,
Piero Altoe,
Dustin Anderson,
John Anderson,
Michael Andrews,
Juan Pedro Araque Espinosa,
Adam Aurisano,
Laurent Basara,
Adrian Bevan,
Wahid Bhimji,
Daniele Bonacorsi,
Bjorn Burkle,
Paolo Calafiura,
Mario Campanelli,
Louis Capps,
Federico Carminati,
Stefano Carrazza,
Yi-fan Chen,
Taylor Childers,
Yann Coadou,
Elias Coniavitis,
Kyle Cranmer,
Claire David,
Douglas Davis,
Andrea De Simone
, et al. (103 additional authors not shown)
Abstract:
Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.
Submitted 16 May, 2019; v1 submitted 8 July, 2018;
originally announced July 2018.
-
Greedy and Local Ratio Algorithms in the MapReduce Model
Authors:
Nicholas J. A. Harvey,
Christopher Liaw,
Paul Liu
Abstract:
MapReduce has become the de facto standard model for designing distributed algorithms to process big data on a cluster. There has been considerable research on designing efficient MapReduce algorithms for clustering, graph optimization, and submodular optimization problems. We develop new techniques for designing greedy and local ratio algorithms in this setting. Our randomized local ratio technique gives $2$-approximations for weighted vertex cover and weighted matching, and an $f$-approximation for weighted set cover, all in a constant number of MapReduce rounds. Our randomized greedy technique gives algorithms for maximal independent set, maximal clique, and a $(1+ε)\ln Δ$-approximation for weighted set cover. We also give greedy algorithms for vertex colouring with $(1+o(1))Δ$ colours and edge colouring with $(1+o(1))Δ$ colours.
Submitted 17 June, 2018;
originally announced June 2018.
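As a rough illustration of the local ratio idea behind the $2$-approximation for weighted vertex cover, here is a minimal sketch of the classic sequential local ratio procedure. It is not the randomized MapReduce algorithm of the paper; the function name, the example graph, and the assumption of integer weights (so that the zero test is exact) are all illustrative choices.

```python
def local_ratio_vertex_cover(weights, edges):
    """Sequential local ratio sketch for weighted vertex cover.

    weights: dict mapping vertex -> non-negative integer weight (assumed).
    edges:   iterable of (u, v) pairs.
    Returns a set of vertices covering every edge, with total weight
    at most twice the optimum.
    """
    residual = dict(weights)  # work on a copy of the weights
    for u, v in edges:
        # Pay the same amount on both endpoints; any cover must pay
        # at least `delta` for this edge, and we pay at most 2 * delta.
        delta = min(residual[u], residual[v])
        residual[u] -= delta
        residual[v] -= delta
    # After an edge is processed, one endpoint has residual 0 and
    # residuals never increase, so zero-residual vertices form a cover.
    return {v for v, w in residual.items() if w == 0}


# Hypothetical example: a 4-cycle with integer weights.
weights = {"a": 3, "b": 1, "c": 2, "d": 4}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
cover = local_ratio_vertex_cover(weights, edges)
```

On this example the optimum cover has weight 5 (either `{a, c}` or `{b, d}`), so the returned cover is guaranteed to weigh at most 10.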
-
Dimensionality reduction methods for molecular simulations
Authors:
Stefan Doerr,
Igor Ariz-Extreme,
Matthew J. Harvey,
Gianni De Fabritiis
Abstract:
Molecular simulations produce very high-dimensional data-sets with millions of data points. As analysis methods are often unable to cope with so many dimensions, it is common to use dimensionality reduction and clustering methods to reach a reduced representation of the data. Yet these methods often fail to capture the most important features necessary for the construction of a Markov model. Here we demonstrate the results of various dimensionality reduction methods on two simulation data-sets, one of protein folding and another of protein-ligand binding. The methods tested include a k-means clustering variant, a non-linear auto encoder, principal component analysis and tICA. The dimension-reduced data is then used to estimate the implied timescales of the slowest process by a Markov state model analysis to assess the quality of the projection. The projected dimensions learned from the data are visualized to demonstrate which conformations the various methods choose to represent the molecular process.
Submitted 2 November, 2017; v1 submitted 29 October, 2017;
originally announced October 2017.
-
Computing the Independence Polynomial: from the Tree Threshold down to the Roots
Authors:
Nicholas J. A. Harvey,
Piyush Srivastava,
Jan Vondrák
Abstract:
We study an algorithm for approximating the multivariate independence polynomial $Z(\mathbf{z})$, with negative and complex arguments, an object that has strong connections to combinatorics and to statistical physics. In particular, the independence polynomial with negative arguments, $Z(-\mathbf{p})$, determines the Shearer region, the maximal region of probabilities to which the Lovasz Local Lemma (LLL) can be extended (Shearer 1985). In statistical physics, complex zeros of the independence polynomial relate to existence of phase transitions.
Our main result is a deterministic algorithm to compute approximately the independence polynomial in any root-free complex polydisc centered at the origin. Our algorithm is essentially the same as Weitz's algorithm for positive parameters up to the tree uniqueness threshold, and the core of our analysis is a novel multivariate form of the correlation decay technique, which can handle non-uniform complex parameters. In particular, in the univariate real setting our work implies that Weitz's algorithm works in an interval between two critical points $(λ'_c(d), λ_c(d))$, and outside of this interval an approximation of $Z(\mathbf{z})$ is known to be NP-hard.
As an application, we give a sub-exponential time algorithm for testing approximate membership in the Shearer region. We also give a new rounding based deterministic algorithm for Shearer's lemma (an extension of the LLL), which, however, runs in sub-exponential time. On the hardness side, we prove that evaluating $Z(\mathbf{z})$ at an arbitrary point in Shearer's region, and testing membership in Shearer's region, are #P-hard problems. We also establish the best possible dependence of the exponent of the run time of Weitz's correlation decay technique in the negative regime on the distance to the boundary of the Shearer region.
Submitted 11 November, 2017; v1 submitted 7 August, 2016;
originally announced August 2016.
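To make the object in this abstract concrete, the sketch below evaluates the multivariate independence polynomial $Z(\mathbf{z}) = \sum_{I} \prod_{v \in I} z_v$, summing over the independent sets $I$ of a graph, by brute-force enumeration. This only illustrates the definition on tiny graphs and takes exponential time; it is not the correlation decay algorithm the paper analyzes. The function name and the example graph are assumptions.

```python
from itertools import combinations


def independence_polynomial(n, edges, z):
    """Brute-force Z(z) for a graph on vertices 0..n-1.

    edges: iterable of (u, v) pairs.
    z:     sequence of n real or complex vertex activities.
    """
    edge_set = {frozenset(e) for e in edges}
    total = 0
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            # Skip subsets that contain both endpoints of some edge.
            if any(frozenset(pair) in edge_set
                   for pair in combinations(subset, 2)):
                continue
            term = 1
            for v in subset:
                term *= z[v]
            total += term
    return total


# Hypothetical example: the path 0 - 1 - 2.
path_edges = [(0, 1), (1, 2)]
count = independence_polynomial(3, path_edges, [1, 1, 1])
negative = independence_polynomial(3, path_edges, [-0.25, -0.25, -0.25])
```

With all activities equal to 1 the polynomial counts independent sets; negative arguments, as in `negative`, are the regime the abstract relates to the Shearer region.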
-
Pipage Rounding, Pessimistic Estimators and Matrix Concentration
Authors:
Nicholas J. A. Harvey,
Neil Olver
Abstract:
Pipage rounding is a dependent random sampling technique that has several interesting properties and diverse applications. One property that has been particularly useful is negative correlation of the resulting vector. Unfortunately negative correlation has its limitations, and there are some further desirable properties that do not seem to follow from existing techniques. In particular, recent concentration results for sums of independent random matrices are not known to extend to a negatively dependent setting.
We introduce a simple but useful technique called concavity of pessimistic estimators. This technique allows us to show concentration of submodular functions and concentration of matrix sums under pipage rounding. The former result answers a question of Chekuri et al. (2009). To prove the latter result, we derive a new variant of Lieb's celebrated concavity theorem in matrix analysis.
We provide numerous applications of these results. One is to spectrally-thin trees, a spectral analog of the thin trees that played a crucial role in the recent breakthrough on the asymmetric traveling salesman problem. We show a polynomial time algorithm that, given a graph where every edge has effective conductance at least $κ$, returns an $O(κ^{-1} \cdot \log n / \log \log n)$-spectrally-thin tree. There are further applications to rounding of semidefinite programs, to the column subset selection problem, and to a geometric question of extracting a nearly-orthonormal basis from an isotropic distribution.
Submitted 8 July, 2013;
originally announced July 2013.
-
A linear-time algorithm for finding a complete graph minor in a dense graph
Authors:
Vida Dujmović,
Daniel J. Harvey,
Gwenaël Joret,
Bruce Reed,
David R. Wood
Abstract:
Let $g(t)$ be the minimum number such that every graph $G$ with average degree $d(G) \geq g(t)$ contains a $K_t$-minor. Such a function is known to exist, as originally shown by Mader. Kostochka and Thomason independently proved that $g(t) \in Θ(t\sqrt{\log t})$. This article shows that for all fixed $ε > 0$ and fixed sufficiently large $t \geq t(ε)$, if $d(G) \geq (2+ε)g(t)$ then we can find this $K_t$-minor in linear time. This improves a previous result by Reed and Wood who gave a linear-time algorithm when $d(G) \geq 2^{t-2}$.
Submitted 23 April, 2013; v1 submitted 12 February, 2012;
originally announced February 2012.
-
Sparse Sums of Positive Semidefinite Matrices
Authors:
Marcel K. de Carli Silva,
Nicholas J. A. Harvey,
Cristiane M. Sato
Abstract:
Recently there has been much interest in "sparsifying" sums of rank one matrices: modifying the coefficients such that only a few are nonzero, while approximately preserving the matrix that results from the sum. Results of this sort have found applications in many different areas, including sparsifying graphs. In this paper we consider the more general problem of sparsifying sums of positive semidefinite matrices that have arbitrary rank.
We give several algorithms for solving this problem. The first algorithm is based on the method of Batson, Spielman and Srivastava (2009). The second algorithm is based on the matrix multiplicative weights update method of Arora and Kale (2007). We also highlight an interesting connection between these two algorithms.
Our algorithms have numerous applications. We show how they can be used to construct graph sparsifiers with auxiliary constraints, sparsifiers of hypergraphs, and sparse solutions to semidefinite programs.
Submitted 17 October, 2011; v1 submitted 30 June, 2011;
originally announced July 2011.
-
Submodular Functions: Learnability, Structure, and Optimization
Authors:
Maria-Florina Balcan,
Nicholas J. A. Harvey
Abstract:
Submodular functions are discrete functions that model laws of diminishing returns and enjoy numerous algorithmic applications. They have been used in many areas, including combinatorial optimization, machine learning, and economics. In this work we study submodular functions from a learning theoretic angle. We provide algorithms for learning submodular functions, as well as lower bounds on their learnability. In doing so, we uncover several novel structural results revealing ways in which submodular functions can be both surprisingly structured and surprisingly unstructured. We provide several concrete implications of our work in other domains including algorithmic game theory and combinatorial optimization.
At a technical level, this research combines ideas from many areas, including learning theory (distributional learning and PAC-style analyses), combinatorics and optimization (matroids and submodular functions), and pseudorandomness (lossless expander graphs).
Submitted 21 August, 2012; v1 submitted 12 August, 2010;
originally announced August 2010.
-
Graph Sparsification by Edge-Connectivity and Random Spanning Trees
Authors:
Wai Shing Fung,
Nicholas J. A. Harvey
Abstract:
We present new approaches to constructing graph sparsifiers --- weighted subgraphs for which every cut has the same value as the original graph, up to a factor of $(1 \pm ε)$. Our first approach independently samples each edge $uv$ with probability inversely proportional to the edge-connectivity between $u$ and $v$. The fact that this approach produces a sparsifier resolves a question posed by Benczúr and Karger (2002). Concurrent work of Hariharan and Panigrahi also resolves this question. Our second approach constructs a sparsifier by forming the union of several uniformly random spanning trees. Both of our approaches produce sparsifiers with $O(n \log^2(n)/ε^2)$ edges. Our proofs are based on extensions of Karger's contraction algorithm, which may be of independent interest.
Submitted 9 August, 2010; v1 submitted 3 May, 2010;
originally announced May 2010.
-
The complexity of UNO
Authors:
Erik D. Demaine,
Martin L. Demaine,
Nicholas J. A. Harvey,
Ryuhei Uehara,
Takeaki Uno,
Yushi Uno
Abstract:
This paper investigates the popular card game UNO from the viewpoint of algorithmic combinatorial game theory. We define simple and concise mathematical models for the game, including both cooperative and uncooperative versions, and analyze their computational complexity. In particular, we prove that even a single-player version of UNO is NP-complete, although some restricted cases are in P. Surprisingly, we show that the uncooperative two-player version is also in P.
Submitted 2 December, 2013; v1 submitted 15 March, 2010;
originally announced March 2010.
-
A Randomized Rounding Algorithm for the Asymmetric Traveling Salesman Problem
Authors:
Michel X. Goemans,
Nicholas J. A. Harvey,
Kamal Jain,
Mohit Singh
Abstract:
We present an algorithm for the asymmetric traveling salesman problem on instances which satisfy the triangle inequality. Like several existing algorithms, it achieves approximation ratio O(log n). Unlike previous algorithms, it uses randomized rounding.
Submitted 4 September, 2009;
originally announced September 2009.
-
Sketching and Streaming Entropy via Approximation Theory
Authors:
Nicholas J. A. Harvey,
Jelani Nelson,
Krzysztof Onak
Abstract:
We conclude a sequence of work by giving near-optimal sketching and streaming algorithms for estimating Shannon entropy in the most general streaming model, with arbitrary insertions and deletions. This improves on prior results that obtain suboptimal space bounds in the general model, and near-optimal bounds in the insertion-only model without sketching. Our high-level approach is simple: we give algorithms to estimate Renyi and Tsallis entropy, and use them to extrapolate an estimate of Shannon entropy. The accuracy of our estimates is proven using approximation theory arguments and extremal properties of Chebyshev polynomials, a technique which may be useful for other problems. Our work also yields the best-known and near-optimal additive approximations for entropy, and hence also for conditional entropy and mutual information.
Submitted 25 April, 2008;
originally announced April 2008.
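The high-level approach described in this abstract, estimating Tsallis entropy at several orders and extrapolating to Shannon entropy, can be sketched numerically. The code below is only an illustration under strong assumptions: it computes Tsallis entropy exactly from a known distribution rather than from a stream or sketch, and it uses three arbitrarily chosen orders with plain Lagrange interpolation instead of the paper's Chebyshev-based choice of points. All function names are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in nats of a probability vector p."""
    return -sum(x * math.log(x) for x in p if x > 0)


def tsallis_entropy(p, q):
    """Tsallis entropy of order q != 1; tends to Shannon as q -> 1."""
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


def extrapolate_to_shannon(p, orders=(1.05, 1.10, 1.15)):
    """Lagrange-interpolate T_q over `orders` and evaluate at q = 1."""
    values = [tsallis_entropy(p, q) for q in orders]
    estimate = 0.0
    for i, qi in enumerate(orders):
        # Lagrange basis polynomial for node qi, evaluated at q = 1.
        basis = 1.0
        for j, qj in enumerate(orders):
            if j != i:
                basis *= (1.0 - qj) / (qi - qj)
        estimate += values[i] * basis
    return estimate


# Hypothetical example distribution.
p = [0.5, 0.25, 0.25]
exact = shannon_entropy(p)
approx = extrapolate_to_shannon(p)
```

In a streaming setting the exact `tsallis_entropy` call would be replaced by an approximate estimator, and the choice of interpolation points would then govern how estimation error is amplified by the extrapolation.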
-
Algebraic Structures and Algorithms for Matching and Matroid Problems (Preliminary Version)
Authors:
Nicholas J. A. Harvey
Abstract:
Basic path-matchings, introduced by Cunningham and Geelen (FOCS 1996), are a common generalization of matroid intersection and non-bipartite matching. The main results of this paper are a new algebraic characterization of basic path-matching problems and an algorithm for constructing basic path-matchings in O(n^w) time, where n is the number of vertices and w is the exponent for matrix multiplication. Our algorithms are randomized, and our approach assumes that the given matroids are linear and can be represented over the same field.
Our main results have interesting consequences for several special cases of path-matching problems. For matroid intersection, we obtain an algorithm with running time O(nr^(w-1))=O(nr^1.38), where the matroids have n elements and rank r. This improves the long-standing bound of O(nr^1.62) due to Gabow and Xu (FOCS 1989). Also, we obtain a simple, purely algebraic algorithm for non-bipartite matching with running time O(n^w). This resolves the central open problem of Mucha and Sankowski (FOCS 2004).
Submitted 9 January, 2006;
originally announced January 2006.