Data Science In Olfaction

Authors: Vivek Agarwal, Joshua Harvey, Dmitry Rinberg, Vasant Dhar

Abstract: Advances in neural sensing technology are making it possible to observe the olfactory process in great detail. In this paper, we conceptualize smell from a Data Science and AI perspective that relates the properties of odorants to how they are sensed and analyzed in the olfactory system from the nose to the brain. Drawing distinctions to color vision, we argue that smell presents unique measurement challenges, including the complexity of stimuli, the high dimensionality of the sensory apparatus, as well as what constitutes ground truth. In the face of these challenges, we argue for the centrality of odorant-receptor interactions in developing a theory of olfaction. Such a theory is likely to find widespread industrial applications, and enhance our understanding of smell, and in the longer-term, how it relates to other senses and language. As an initial use case of the data, we present results using machine learning-based classification of neural responses to odors as they are recorded in the mouse olfactory bulb with calcium imaging.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 20 pages, 10 Figures, 2 Appendix, 1 Table

arXiv:2401.05425 [pdf, other]

An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection

Authors: Abdul Aziz, Nhat Pham, Neel Vora, Cody Reynolds, Jaime Lehnen, Pooja Venkatesh, Zhuoran Yao, Jay Harvey, Tam Vu, Kan Ding, Phuc Nguyen

Abstract: Epilepsy is one of the most common neurological diseases globally, affecting around 50 million people worldwide. Fortunately, up to 70 percent of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scalp-based EEG test, despite being the gold standard for diagnosing epilepsy, is costly, necessitates hospitalization, demands skilled professionals for operation, and is discomforting for users. In this paper, we propose EarSD, a novel lightweight, unobtrusive, and socially acceptable ear-worn system to detect epileptic seizure onsets by measuring the physiological signals from behind the user's ears. EarSD includes an integrated custom-built sensing, computing, and communication PCB to collect and amplify the signals of interest, remove the noises caused by motion artifacts and environmental impacts, and stream the data wirelessly to the computer or mobile phone nearby, where data are uploaded to the host computer for further processing. We conducted both in-lab and in-hospital experiments with epileptic seizure patients who were hospitalized for seizure studies. The preliminary results confirm that EarSD can detect seizures with up to 95.3 percent accuracy by just using classical machine learning algorithms.

Submitted 1 January, 2024; originally announced January 2024.

arXiv:2312.12587 [pdf, other]

Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression

Authors: Neel R Vora, Amir Hajighasemi, Cody T. Reynolds, Amirmohammad Radmehr, Mohamed Mohamed, Jillur Rahman Saurav, Abdul Aziz, Jai Prakash Veerla, Mohammad S Nasr, Hayden Lotspeich, Partha Sai Guttikonda, Thuong Pham, Aarti Darji, Parisa Boodaghi Malidarreh, Helen H Shang, Jay Harvey, Kan Ding, Phuc Nguyen, Jacob M Luber

Abstract: Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in clinical diagnosis, monitoring, and treatment of important brain disorders. However, the real-time transmission of the significant corpus of physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monitoring wearables. This paper presents a novel deep-learning framework employing a variational autoencoder (VAE) for physiological signal compression to reduce wearables' computational complexity and energy consumption. Our approach achieves an impressive compression ratio of 1:293 specifically for spectrogram data, surpassing state-of-the-art compression techniques such as JPEG2000, H.264, Discrete Cosine Transform (DCT), and Huffman Encoding, which do not excel in handling physiological signals. We validate the efficacy of the compressed algorithms using collected physiological signals from real patients in the hospital and deploy the solution on commonly used embedded AI chips (i.e., ARM Cortex V8 and Jetson Nano). The proposed framework achieves a 91% seizure detection accuracy using XGBoost, confirming the approach's reliability, practicality, and scalability.

Submitted 4 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.02658 [pdf]

Do AI models produce better weather forecasts than physics-based models? A quantitative evaluation case study of Storm Ciarán

Authors: Andrew J. Charlton-Perez, Helen F. Dacre, Simon Driscoll, Suzanne L. Gray, Ben Harvey, Natalie J. Harvey, Kieran M. R. Hunt, Robert W. Lee, Ranjini Swaminathan, Remy Vandaele, Ambrogio Volonté

Abstract: There has been huge recent interest in the potential of making operational weather forecasts using machine learning techniques. As they become a part of the weather forecasting toolbox, there is a pressing need to understand how well current machine learning models can simulate high-impact weather events. We compare forecasts of Storm Ciarán, a European windstorm that caused sixteen deaths and extensive damage in Northern Europe, made by machine learning and numerical weather prediction models. The four machine learning models considered (FourCastNet, Pangu-Weather, GraphCast and FourCastNet-v2) produce forecasts that accurately capture the synoptic-scale structure of the cyclone including the position of the cloud head, shape of the warm sector and location of warm conveyor belt jet, and the large-scale dynamical drivers important for the rapid storm development such as the position of the storm relative to the upper-level jet exit. However, their ability to resolve the more detailed structures important for issuing weather warnings is more mixed. All of the machine learning models underestimate the peak amplitude of winds associated with the storm, only some machine learning models resolve the warm core seclusion and none of the machine learning models capture the sharp bent-back warm frontal gradient. Our study shows there is a great deal about the performance and properties of machine learning weather forecasts that can be derived from case studies of high-impact weather events such as Storm Ciarán.

Submitted 19 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

arXiv:2211.05262 [pdf, other]

Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization

Authors: Alexander Wikner, Joseph Harvey, Michelle Girvan, Brian R. Hunt, Andrew Pomerance, Thomas Antonsen, Edward Ott

Abstract: Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of unknown chaotic dynamical systems. Short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics ("climate") can be produced by employing a feedback loop, whereby the model is trained to predict forward one time step, then the model output is used as input for multiple time steps. In the absence of mitigating techniques, however, this technique can result in artificially rapid error growth. In this article, we systematically examine the technique of adding noise to the ML model input during training to promote stability and improve prediction accuracy. Furthermore, we introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training. Our case study uses reservoir computing, a machine-learning method using recurrent neural networks, to predict the spatiotemporal chaotic Kuramoto-Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to the true system, while reservoir computers trained without regularization are unstable. Compared with other regularization techniques that yield stability in some cases, we find that both short-term and climate predictions from reservoir computers trained with noise or with LMNT are substantially more accurate. Finally, we show that the deterministic aspect of our LMNT regularization facilitates fast hyperparameter tuning when compared to training with noise.

Submitted 12 December, 2022; v1 submitted 9 November, 2022; originally announced November 2022.

Comments: 39 pages, 8 figures, 5 tables

arXiv:2206.00236 [pdf, other]

Continuous Prediction with Experts' Advice

Authors: Victor Sanches Portella, Christopher Liaw, Nicholas J. A. Harvey

Abstract: Prediction with experts' advice is one of the most fundamental problems in online learning and captures many of its technical challenges. A recent line of work has looked at online learning through the lens of differential equations and continuous-time analysis. This viewpoint has yielded optimal results for several problems in online learning. In this paper, we employ continuous-time stochastic calculus in order to study the discrete-time experts' problem. We use these tools to design a continuous-time, parameter-free algorithm with improved guarantees for the quantile regret. We then develop an analogous discrete-time algorithm with a very similar analysis and identical quantile regret bounds. Finally, we design an anytime continuous-time algorithm with regret matching the optimal fixed-time rate when the gains are independent Brownian Motions; in many settings, this is the most difficult case. This gives some evidence that, even with adversarial gains, the optimal anytime and fixed-time regrets may coincide.

Submitted 30 September, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: 30 pages, 1 figure. Version 2 diff: minor edits, reorganization for a journal submission, correct statement of Lemma 5.1 and a better formatted proof of the same lemma

arXiv:2203.07577 [pdf, ps, other]

Efficient and Optimal Fixed-Time Regret with Two Experts

Authors: Laura Greenstreet, Nicholas J. A. Harvey, Victor Sanches Portella

Abstract: Prediction with expert advice is a foundational problem in online learning. In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand. Moreover, this is asymptotically optimal when both $T$ and $n$ grow to infinity. However, when the number of experts $n$ is small/fixed, algorithms with better regret guarantees exist. Cover showed in 1967 a dynamic programming algorithm for the two-experts problem restricted to $\{0,1\}$ costs that suffers at most $\sqrt{T/(2\pi)} + O(1)$ regret with $O(T^2)$ pre-processing time. In this work, we propose an optimal algorithm for prediction with two experts' advice that works even for costs in $[0,1]$ and with $O(1)$ processing time per turn. Our algorithm builds upon recent work on the experts problem based on techniques and tools from stochastic calculus.

Submitted 14 March, 2022; originally announced March 2022.

Comments: 29 pages, 13 pages of main text, published in ALT 2022 (PMLR vol. 167)
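
The classical baseline named in the abstract above can be sketched as follows. This is a minimal illustration of Multiplicative Weights Update in its exponential-weights form for costs in $[0,1]$; it is not the optimal two-experts algorithm the paper proposes. The function names are hypothetical, and the learning rate $\eta = \sqrt{8 \ln n / T}$ is an assumption taken from the standard textbook analysis, under which the regret of the expected cost is at most $\sqrt{(T/2)\ln n}$.

```python
import math


def mwu_regret(costs, eta):
    """Run exponential-weights MWU on a T x n cost table with entries in [0, 1].

    Returns the regret of the algorithm's expected cost against the best
    single expert in hindsight. Names and structure are illustrative only.
    """
    n = len(costs[0])
    log_w = [0.0] * n          # log-weights, kept in log space for stability
    cumulative = [0.0] * n     # total cost of each expert so far
    alg_cost = 0.0
    for c in costs:
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        z = sum(w)
        p = [wi / z for wi in w]                      # current distribution
        alg_cost += sum(pi * ci for pi, ci in zip(p, c))
        for i, ci in enumerate(c):
            log_w[i] -= eta * ci                      # multiplicative update
            cumulative[i] += ci
    return alg_cost - min(cumulative)


def mwu_regret_bound(T, n):
    """The sqrt((T/2) ln n) bound quoted in the abstract."""
    return math.sqrt((T / 2) * math.log(n))
```

A typical call fixes $T$ in advance, sets `eta = math.sqrt(8 * math.log(n) / T)`, and compares `mwu_regret(costs, eta)` with `mwu_regret_bound(T, n)`.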

arXiv:2010.12033 [pdf, ps, other]

Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses

Authors: Yihan Zhou, Victor S. Portella, Mark Schmidt, Nicholas J. A. Harvey

Abstract: In online convex optimization (OCO), Lipschitz continuity of the functions is commonly assumed in order to obtain sublinear regret. Moreover, many algorithms have only logarithmic regret when these functions are also strongly convex. Recently, researchers from convex optimization proposed the notions of "relative Lipschitz continuity" and "relative strong convexity". Both of the notions are generalizations of their classical counterparts. It has been shown that subgradient methods in the relative setting have performance analogous to their performance in the classical setting. In this work, we consider OCO for relative Lipschitz and relative strongly convex functions. We extend the known regret bounds for classical OCO algorithms to the relative setting. Specifically, we show regret bounds for the follow the regularized leader algorithms and a variant of online mirror descent. Due to the generality of these methods, these results yield regret bounds for a wide variety of OCO algorithms. Furthermore, we extend the results to algorithms with extra regularization such as regularized dual averaging.

Submitted 28 December, 2020; v1 submitted 22 October, 2020; originally announced October 2020.

Comments: 22 pages, camera-ready version, accepted in NeurIPS 2020 for poster presentation. Second version has a new reference (acknowledged in the acknowledgements section) and comments on the relationship of this work with the paper

arXiv:2006.02585 [pdf, other]

Online mirror descent and dual averaging: keeping pace in the dynamic case

Authors: Huang Fang, Nicholas J. A. Harvey, Victor S. Portella, Michael P. Friedlander

Abstract: Online mirror descent (OMD) and dual averaging (DA) -- two fundamental algorithms for online convex optimization -- are known to have very similar (and sometimes identical) performance guarantees when used with a fixed learning rate. Under dynamic learning rates, however, OMD is provably inferior to DA and suffers a linear regret, even in common settings such as prediction with expert advice. We modify the OMD algorithm through a simple technique that we call stabilization. We give essentially the same abstract regret bound for OMD with stabilization and for DA by modifying the classical OMD convergence analysis in a careful and modular way that allows for straightforward and flexible proofs. Simple corollaries of these bounds show that OMD with stabilization and DA enjoy the same performance guarantees in many applications -- even under dynamic learning rates. We also shed light on the similarities between OMD and DA and show simple conditions under which stabilized-OMD and DA generate the same iterates.

Submitted 3 September, 2021; v1 submitted 3 June, 2020; originally announced June 2020.

Comments: 27 pages main text, 37 pages in total, 1 figure. Version 2: Revision for camera-ready version of ICML 2020, with a new abstract, new discussion and acknowledgements sections, and some other minor modifications. Version 3: Technical report version of JMLR submission, with minor revisions, full proofs, and more details on the setting with composite functions
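
To make the contrast in the abstract above concrete, here is a small sketch of both methods in the experts setting (probability simplex with an entropic regularizer). It only illustrates that the iterates coincide under a fixed learning rate and separate under a decreasing one; it does not implement the paper's stabilization technique and does not reproduce its regret results. The function names and the indexing convention (rate $\eta_t$ applied when forming iterate $t+1$) are assumptions made for the sketch.

```python
import math


def _softmax(scores):
    """Normalize exp(scores) into a probability vector, stably."""
    m = max(scores)
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    return [wi / z for wi in w]


def dual_averaging(costs, etas):
    """Entropic DA: p_{t+1} is proportional to exp(-eta_t * sum_{s<=t} c_s)."""
    n = len(costs[0])
    cumulative = [0.0] * n
    iterates = [[1.0 / n] * n]
    for c, eta in zip(costs, etas):
        cumulative = [a + b for a, b in zip(cumulative, c)]
        iterates.append(_softmax([-eta * a for a in cumulative]))
    return iterates


def mirror_descent(costs, etas):
    """Entropic OMD: p_{t+1} is proportional to p_t * exp(-eta_t * c_t)."""
    n = len(costs[0])
    log_p = [0.0] * n   # unnormalized log-probabilities
    iterates = [[1.0 / n] * n]
    for c, eta in zip(costs, etas):
        log_p = [lp - eta * ci for lp, ci in zip(log_p, c)]
        iterates.append(_softmax(log_p))
    return iterates


def max_gap(a, b):
    """Largest coordinate-wise difference between two iterate sequences."""
    return max(abs(x - y) for pa, pb in zip(a, b) for x, y in zip(pa, pb))
```

With a constant list of rates the two sequences agree up to floating-point error, because both reduce to exponential weights on the cumulative cost. With rates such as `1 / math.sqrt(t + 1)`, OMD weights each past cost by the rate in force when it arrived, while DA rescales the whole history by the current rate, so the iterates diverge.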

arXiv:2005.02749 [pdf, other]

doi 10.1016/j.ascom.2020.100382

Introducing PyCross: PyCloudy Rendering Of Shape Software for pseudo 3D ionisation modelling of nebulae

Authors: K. Fitzgerald, E. J Harvey, N. Keaveney, M. Redman

Abstract: Research into the processes of photoionised nebulae plays a significant part in our understanding of stellar evolution. It is extremely difficult to visually represent or model ionised nebulae, requiring astronomers to employ sophisticated modelling code to derive temperature, density and chemical composition. Existing codes are available that often require steep learning curves and produce models derived from mathematical functions. In this article we will introduce PyCross: PyCloudy Rendering Of Shape Software. This is a pseudo 3D modelling application that generates photoionisation models of optically thin nebulae, created using the Shape software. Currently PyCross has been used for novae and planetary nebulae, and it can be extended to Active Galactic Nuclei or any other type of photoionised axisymmetric nebulae. Functionality, an operational overview, and a scientific pipeline will be described with scenarios where PyCross has been adopted for novae (V5668 Sagittarii (2015) & V4362 Sagittarii (1994)) and a planetary nebula (LoTr1). Unlike the aforementioned photoionised codes this application does not require any coding experience, nor the need to derive complex mathematical models, instead utilising the select features from Cloudy/PyCloudy and Shape. The software was developed using a formal software development lifecycle, written in Python and will work without the need to install any development environments or additional python packages. This application, Shape models and PyCross archive examples are freely available to students, academics and research community on GitHub for download (https://github.com/karolfitzgerald/PyCross_OSX_App).

Submitted 6 May, 2020; originally announced May 2020.

Comments: 15 pages, 12 figures

Journal ref: Astronomy and Computing, Volume 32, July 2020, 100382

arXiv:2002.08994 [pdf, other]

Optimal anytime regret with two experts

Authors: Nicholas J. A. Harvey, Christopher Liaw, Edwin Perkins, Sikander Randhawa

Abstract: We consider the classical problem of prediction with expert advice. In the fixed-time setting, where the time horizon is known in advance, algorithms that achieve the optimal regret are known when there are two, three, or four experts or when the number of experts is large. Much less is known about the problem in the anytime setting, where the time horizon is not known in advance. No minimax optimal algorithm was previously known in the anytime setting, regardless of the number of experts. Even for the case of two experts, Luo and Schapire have left open the problem of determining the optimal algorithm. We design the first minimax optimal algorithm for minimizing regret in the anytime setting. We consider the case of two experts, and prove that the optimal regret is $γ\sqrt{t} / 2$ at all time steps $t$, where $γ$ is a natural constant that arose 35 years ago in studying fundamental properties of Brownian motion. The algorithm is designed by considering a continuous analogue of the regret problem, which is solved using ideas from stochastic calculus.

Submitted 26 August, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

Comments: 47 pages, 1 figure

arXiv:2001.08006 [pdf, other]

doi 10.1007/s00454-021-00290-8

Estimating the reach of a manifold via its convexity defect function

Authors: Clément Berenfeld, John Harvey, Marc Hoffmann, Krishnan Shankar

Abstract: The reach of a submanifold is a crucial regularity parameter for manifold learning and geometric inference from point clouds. This paper relates the reach of a submanifold to its convexity defect function. Using the stability properties of convexity defect functions, along with some new bounds and the recent submanifold estimator of Aamari and Levrard [Ann. Statist. 47 177-204 (2019)], an estimator for the reach is given. A uniform expected loss bound over a C^k model is found. Lower bounds for the minimax rate for estimating the reach over these models are also provided. The estimator almost achieves these rates in the C^3 and C^4 cases, with a gap given by a logarithmic factor.

Submitted 26 March, 2021; v1 submitted 22 January, 2020; originally announced January 2020.

Comments: 35 pages, 4 figures. Various minor changes in v2 to correct minor errors and/or improve clarity. Thanks to excellent work by peer reviewers, in v3 an error in Lemma 4.9 was rectified, Section 4.2 was substantially revised and other minor changes made throughout and the manuscript was accepted for publication by Discrete & Computational Geometry. Extremely minor changes in v4

MSC Class: 62G05 (Primary) 62C20; 53A07; 53C40 (Secondary)

Journal ref: Discrete & Computational Geometry 67 (2022), 403-438

arXiv:1909.00843 [pdf, other]

Simple and optimal high-probability bounds for strongly-convex stochastic gradient descent

Authors: Nicholas J. A. Harvey, Christopher Liaw, Sikander Randhawa

Abstract: We consider stochastic gradient descent algorithms for minimizing a non-smooth, strongly-convex function. Several forms of this algorithm, including suffix averaging, are known to achieve the optimal $O(1/T)$ convergence rate in expectation. We consider a simple, non-uniform averaging strategy of Lacoste-Julien et al. (2011) and prove that it achieves the optimal $O(1/T)$ convergence rate with high probability. Our proof uses a recently developed generalization of Freedman's inequality. Finally, we compare several of these algorithms experimentally and show that this non-uniform averaging strategy outperforms many standard techniques, and with smaller variance.

Submitted 2 September, 2019; originally announced September 2019.
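
A non-uniform averaging scheme of the kind discussed in the abstract above can be sketched on a one-dimensional toy problem. The sketch assumes the commonly cited form of the Lacoste-Julien et al. strategy: step size $\eta_t = 2/(\mu(t+1))$ and an output that weights iterate $x_t$ proportionally to $t$. The exact indexing in the paper may differ. The objective $f(x) = |x| + x^2/2$ on $[-10, 10]$, the uniform gradient noise, and all function names are hypothetical choices for illustration.

```python
import random


def f(x):
    """Toy non-smooth, 1-strongly-convex objective; minimized at x = 0."""
    return abs(x) + 0.5 * x * x


def noisy_subgrad(x, rng):
    """A subgradient of f at x plus bounded zero-mean noise (an assumption)."""
    sign = (x > 0) - (x < 0)
    return sign + x + rng.uniform(-1.0, 1.0)


def sgd_weighted_average(subgrad, x0, mu, T, rng, radius=10.0):
    """Projected SGD returning the t-weighted average of x_1, ..., x_T."""
    x = x0
    weighted_sum = 0.0
    for t in range(1, T + 1):
        weighted_sum += t * x                    # iterate x_t gets weight t
        eta = 2.0 / (mu * (t + 1))
        x -= eta * subgrad(x, rng)
        x = max(-radius, min(radius, x))         # project back onto [-radius, radius]
    return 2.0 * weighted_sum / (T * (T + 1))    # weights sum to T(T+1)/2
```

For example, `sgd_weighted_average(noisy_subgrad, 5.0, 1.0, 5000, random.Random(0))` returns a point whose objective value can be compared against the last iterate or a uniform average on the same noise sequence.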

arXiv:1905.10444 [pdf, other]

Overt visual attention on rendered 3D objects

Authors: Oleksii Sidorov, Joshua S. Harvey, Hannah E. Smithson, Jon Y. Hardeberg

Abstract: This work covers multiple aspects of overt visual attention on 3D renders: measurement, projection, visualization, and application to studying the influence of material appearance on looking behaviour. In the scope of this work, we ran an eye-tracking experiment in which observers were presented with animations of rotating 3D objects. The objects were rendered to simulate different metallic appearances, particularly smooth (glossy), rough (matte), and coated gold. The eye-tracking results illustrate how material appearance itself influences the observer's attention, while all the other parameters remain unchanged. In order to make visualization of the attention maps more natural and also make the analysis more accurate, we develop a novel technique of projection of gaze fixations on the 3D surface of the figure itself, instead of the conventional 2D plane of the screen. The proposed methodology will be useful for further studies of attention and saliency in the computer graphics domain.

Submitted 24 May, 2019; originally announced May 2019.

Comments: Draft submitted to a conference. To be updated

arXiv:1812.08960 [pdf, other]

Lifelong Testing of Smart Autonomous Systems by Shepherding a Swarm of Watchdog Artificial Intelligence Agents

Authors: Hussein Abbass, John Harvey, Kate Yaxley

Abstract: Artificial Intelligence (AI) technologies could be broadly categorised into Analytics and Autonomy. Analytics focuses on algorithms offering perception, comprehension, and projection of knowledge gleaned from sensorial data. Autonomy revolves around decision making, and influencing and shaping the environment through action production. A smart autonomous system (SAS) combines analytics and autonomy to understand, learn, decide and act autonomously. To be useful, SAS must be trusted and that requires testing. Lifelong learning of a SAS compounds the testing process. In the remote chance that it is possible to fully test and certify the system pre-release, which is theoretically an undecidable problem, it is near impossible to predict the future behaviours that these systems, alone or collectively, will exhibit. While it may be feasible to severely restrict such systems' learning abilities to limit the potential unpredictability of their behaviours, an undesirable consequence may be severely limiting their utility. In this paper, we propose the architecture for a watchdog AI (WAI) agent dedicated to lifelong functional testing of SAS. We further propose system specifications including a level of abstraction whereby humans shepherd a swarm of WAI agents to oversee an ecosystem made of humans and SAS. The discussion extends to the challenges, pros, and cons of the proposed concept.

Submitted 21 December, 2018; originally announced December 2018.

arXiv:1812.05217 [pdf, other]

Tight Analyses for Non-Smooth Stochastic Gradient Descent

Authors: Nicholas J. A. Harvey, Christopher Liaw, Yaniv Plan, Sikander Randhawa

Abstract: Consider the problem of minimizing functions that are Lipschitz and strongly convex, but not necessarily differentiable. We prove that after $T$ steps of stochastic gradient descent, the error of the final iterate is $O(\log(T)/T)$ with high probability. We also construct a function from this class for which the error of the final iterate of deterministic gradient descent is $Ω(\log(T)/T)$. This shows that the upper bound is tight and that, in this setting, the last iterate of stochastic gradient descent has the same general error rate (with high probability) as deterministic gradient descent. This resolves both open questions posed by Shamir (2012). An intermediate step of our analysis proves that the suffix averaging method achieves error $O(1/T)$ with high probability, which is optimal (for any first-order optimization method). This improves results of Rakhlin (2012) and Hazan and Kale (2014), both of which achieved error $O(1/T)$, but only in expectation, and achieved a high probability error bound of $O(\log \log(T)/T)$, which is suboptimal. We prove analogous results for functions that are Lipschitz and convex, but not necessarily strongly convex or differentiable. After $T$ steps of stochastic gradient descent, the error of the final iterate is $O(\log(T)/\sqrt{T})$ with high probability, and there exists a function for which the error of the final iterate of deterministic gradient descent is $Ω(\log(T)/\sqrt{T})$.

Submitted 12 December, 2018; originally announced December 2018.
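
The abstract above contrasts the final iterate of stochastic gradient descent with the suffix averaging method. The following is a minimal sketch of that contrast, not the authors' construction: the objective `f`, the noise model, the projection interval $[-1, 1]$, the step size $1/t$, and the choice to average the last half of the iterates are all assumptions made for illustration.

```python
import random


def f(x):
    # Hypothetical test objective: 1-strongly convex, non-differentiable at 0,
    # Lipschitz on [-1, 1], minimized at x = 0 with f(0) = 0.
    return abs(x) + 0.5 * x * x


def noisy_subgradient(x, rng):
    # A subgradient of f at x (0 is a valid choice for |x| at x = 0),
    # plus zero-mean Gaussian noise to make the oracle stochastic.
    sign = (x > 0) - (x < 0)
    return sign + x + rng.gauss(0.0, 1.0)


def sgd(T, seed=0):
    """Run T projected SGD steps; return (final iterate, suffix average)."""
    rng = random.Random(seed)
    x = 1.0
    iterates = []
    for t in range(1, T + 1):
        step = 1.0 / t  # 1 / (lambda * t) with strong convexity lambda = 1
        x = x - step * noisy_subgradient(x, rng)
        x = max(-1.0, min(1.0, x))  # project back onto [-1, 1]
        iterates.append(x)
    suffix = iterates[T // 2:]  # assumed suffix: the last half of the run
    return iterates[-1], sum(suffix) / len(suffix)


final_x, suffix_x = sgd(20000)
print("error of final iterate: ", f(final_x))
print("error of suffix average:", f(suffix_x))
```

A single seeded run like this only illustrates the two estimators; it does not demonstrate the high-probability rates stated in the abstract.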

arXiv:1807.02876 [pdf, other]

Machine Learning in High Energy Physics Community White Paper

Authors: Kim Albertsson, Piero Altoe, Dustin Anderson, John Anderson, Michael Andrews, Juan Pedro Araque Espinosa, Adam Aurisano, Laurent Basara, Adrian Bevan, Wahid Bhimji, Daniele Bonacorsi, Bjorn Burkle, Paolo Calafiura, Mario Campanelli, Louis Capps, Federico Carminati, Stefano Carrazza, Yi-fan Chen, Taylor Childers, Yann Coadou, Elias Coniavitis, Kyle Cranmer, Claire David, Douglas Davis, Andrea De Simone, et al. (103 additional authors not shown)

Abstract: Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.

Submitted 16 May, 2019; v1 submitted 8 July, 2018; originally announced July 2018.

Comments: Editors: Sergei Gleyzer, Paul Seyfert and Steven Schramm

arXiv:1806.06421 [pdf, ps, other]

Greedy and Local Ratio Algorithms in the MapReduce Model

Authors: Nicholas J. A. Harvey, Christopher Liaw, Paul Liu

Abstract: MapReduce has become the de facto standard model for designing distributed algorithms to process big data on a cluster. There has been considerable research on designing efficient MapReduce algorithms for clustering, graph optimization, and submodular optimization problems. We develop new techniques for designing greedy and local ratio algorithms in this setting. Our randomized local ratio technique gives $2$-approximations for weighted vertex cover and weighted matching, and an $f$-approximation for weighted set cover, all in a constant number of MapReduce rounds. Our randomized greedy technique gives algorithms for maximal independent set, maximal clique, and a $(1+ε)\ln Δ$-approximation for weighted set cover. We also give greedy algorithms for vertex colouring with $(1+o(1))Δ$ colours and edge colouring with $(1+o(1))Δ$ colours.

Submitted 17 June, 2018; originally announced June 2018.

Comments: 16 pages

arXiv:1710.10629 [pdf, other]

Dimensionality reduction methods for molecular simulations

Authors: Stefan Doerr, Igor Ariz-Extreme, Matthew J. Harvey, Gianni De Fabritiis

Abstract: Molecular simulations produce very high-dimensional data-sets with millions of data points. As analysis methods are often unable to cope with so many dimensions, it is common to use dimensionality reduction and clustering methods to reach a reduced representation of the data. Yet these methods often fail to capture the most important features necessary for the construction of a Markov model. Here we demonstrate the results of various dimensionality reduction methods on two simulation data-sets, one of protein folding and another of protein-ligand binding. The methods tested include a k-means clustering variant, a non-linear auto encoder, principal component analysis and tICA. The dimension-reduced data is then used to estimate the implied timescales of the slowest process by a Markov state model analysis to assess the quality of the projection. The projected dimensions learned from the data are visualized to demonstrate which conformations the various methods choose to represent the molecular process.

Submitted 2 November, 2017; v1 submitted 29 October, 2017; originally announced October 2017.

Comments: 11 pages, 10 figures

arXiv:1608.02282 [pdf, other]

Computing the Independence Polynomial: from the Tree Threshold down to the Roots

Authors: Nicholas J. A. Harvey, Piyush Srivastava, Jan Vondrák

Abstract: We study an algorithm for approximating the multivariate independence polynomial $Z(\mathbf{z})$, with negative and complex arguments, an object that has strong connections to combinatorics and to statistical physics. In particular, the independence polynomial with negative arguments, $Z(-\mathbf{p})$, determines the Shearer region, the maximal region of probabilities to which the Lovasz Local Lemma (LLL) can be extended (Shearer 1985). In statistical physics, complex zeros of the independence polynomial relate to existence of phase transitions. Our main result is a deterministic algorithm to compute approximately the independence polynomial in any root-free complex polydisc centered at the origin. Our algorithm is essentially the same as Weitz's algorithm for positive parameters up to the tree uniqueness threshold, and the core of our analysis is a novel multivariate form of the correlation decay technique, which can handle non-uniform complex parameters. In particular, in the univariate real setting our work implies that Weitz's algorithm works in an interval between two critical points $(λ'_c(d), λ_c(d))$, and outside of this interval an approximation of $Z(\mathbf{z})$ is known to be NP-hard. As an application, we give a sub-exponential time algorithm for testing approximate membership in the Shearer region. We also give a new rounding based deterministic algorithm for Shearer's lemma (an extension of the LLL), which, however, runs in sub-exponential time.
On the hardness side, we prove that evaluating $Z(\mathbf{z})$ at an arbitrary point in Shearer's region, and testing membership in Shearer's region, are #P-hard problems. We also establish the best possible dependence of the exponent of the run time of Weitz's correlation decay technique in the negative regime on the distance to the boundary of the Shearer region.

Submitted 11 November, 2017; v1 submitted 7 August, 2016; originally announced August 2016.

Comments: 35 pages. Extended abstract to appear in Proceedings of ACM-SIAM SODA, 2018

arXiv:1307.2274 [pdf, ps, other]

Pipage Rounding, Pessimistic Estimators and Matrix Concentration

Authors: Nicholas J. A. Harvey, Neil Olver

Abstract: Pipage rounding is a dependent random sampling technique that has several interesting properties and diverse applications. One property that has been particularly useful is negative correlation of the resulting vector. Unfortunately negative correlation has its limitations, and there are some further desirable properties that do not seem to follow from existing techniques. In particular, recent concentration results for sums of independent random matrices are not known to extend to a negatively dependent setting. We introduce a simple but useful technique called concavity of pessimistic estimators. This technique allows us to show concentration of submodular functions and concentration of matrix sums under pipage rounding. The former result answers a question of Chekuri et al. (2009). To prove the latter result, we derive a new variant of Lieb's celebrated concavity theorem in matrix analysis. We provide numerous applications of these results. One is to spectrally-thin trees, a spectral analog of the thin trees that played a crucial role in the recent breakthrough on the asymmetric traveling salesman problem. We show a polynomial time algorithm that, given a graph where every edge has effective conductance at least $κ$, returns an $O(κ^{-1} \cdot \log n / \log \log n)$-spectrally-thin tree. There are further applications to rounding of semidefinite programs, to the column subset selection problem, and to a geometric question of extracting a nearly-orthonormal basis from an isotropic distribution.

Submitted 8 July, 2013; originally announced July 2013.

arXiv:1202.2624 [pdf, ps, other]

doi 10.1137/120866725

A linear-time algorithm for finding a complete graph minor in a dense graph

Authors: Vida Dujmović, Daniel J. Harvey, Gwenaël Joret, Bruce Reed, David R. Wood

Abstract: Let $g(t)$ be the minimum number such that every graph $G$ with average degree $d(G) \geq g(t)$ contains a $K_t$-minor. Such a function is known to exist, as originally shown by Mader. Kostochka and Thomason independently proved that $g(t) \in Θ(t\sqrt{\log t})$. This article shows that for all fixed $ε > 0$ and fixed sufficiently large $t \geq t(ε)$, if $d(G) \geq (2+ε)g(t)$ then we can find this $K_t$-minor in linear time. This improves a previous result by Reed and Wood who gave a linear-time algorithm when $d(G) \geq 2^{t-2}$.

Submitted 23 April, 2013; v1 submitted 12 February, 2012; originally announced February 2012.

Comments: 6 pages, 0 figures; Clarification added in several places, no change to arguments or results

MSC Class: 05C83; 05C85

Journal ref: SIAM Journal on Discrete Mathematics, 27/4:1770--1774, 2013

arXiv:1107.0088 [pdf, ps, other]

doi 10.1145/2746241

Sparse Sums of Positive Semidefinite Matrices

Authors: Marcel K. de Carli Silva, Nicholas J. A. Harvey, Cristiane M. Sato

Abstract: Recently there has been much interest in "sparsifying" sums of rank one matrices: modifying the coefficients such that only a few are nonzero, while approximately preserving the matrix that results from the sum. Results of this sort have found applications in many different areas, including sparsifying graphs. In this paper we consider the more general problem of sparsifying sums of positive semidefinite matrices that have arbitrary rank. We give several algorithms for solving this problem. The first algorithm is based on the method of Batson, Spielman and Srivastava (2009). The second algorithm is based on the matrix multiplicative weights update method of Arora and Kale (2007). We also highlight an interesting connection between these two algorithms. Our algorithms have numerous applications. We show how they can be used to construct graph sparsifiers with auxiliary constraints, sparsifiers of hypergraphs, and sparse solutions to semidefinite programs.

Submitted 17 October, 2011; v1 submitted 30 June, 2011; originally announced July 2011.

arXiv:1008.2159 [pdf, other]

Submodular Functions: Learnability, Structure, and Optimization

Authors: Maria-Florina Balcan, Nicholas J. A. Harvey

Abstract: Submodular functions are discrete functions that model laws of diminishing returns and enjoy numerous algorithmic applications. They have been used in many areas, including combinatorial optimization, machine learning, and economics. In this work we study submodular functions from a learning theoretic angle. We provide algorithms for learning submodular functions, as well as lower bounds on their learnability. In doing so, we uncover several novel structural results revealing ways in which submodular functions can be both surprisingly structured and surprisingly unstructured. We provide several concrete implications of our work in other domains including algorithmic game theory and combinatorial optimization. At a technical level, this research combines ideas from many areas, including learning theory (distributional learning and PAC-style analyses), combinatorics and optimization (matroids and submodular functions), and pseudorandomness (lossless expander graphs).

Submitted 21 August, 2012; v1 submitted 12 August, 2010; originally announced August 2010.

arXiv:1005.0265 [pdf, ps, other]

Graph Sparsification by Edge-Connectivity and Random Spanning Trees

Authors: Wai Shing Fung, Nicholas J. A. Harvey

Abstract: We present new approaches to constructing graph sparsifiers --- weighted subgraphs for which every cut has the same value as the original graph, up to a factor of $(1 \pm ε)$. Our first approach independently samples each edge $uv$ with probability inversely proportional to the edge-connectivity between $u$ and $v$. The fact that this approach produces a sparsifier resolves a question posed by Benczúr and Karger (2002). Concurrent work of Hariharan and Panigrahi also resolves this question. Our second approach constructs a sparsifier by forming the union of several uniformly random spanning trees. Both of our approaches produce sparsifiers with $O(n \log^2(n)/ε^2)$ edges. Our proofs are based on extensions of Karger's contraction algorithm, which may be of independent interest.

Submitted 9 August, 2010; v1 submitted 3 May, 2010; originally announced May 2010.

arXiv:1003.2851 [pdf, ps, other]

The complexity of UNO

Authors: Erik D. Demaine, Martin L. Demaine, Nicholas J. A. Harvey, Ryuhei Uehara, Takeaki Uno, Yushi Uno

Abstract: This paper investigates the popular card game UNO from the viewpoint of algorithmic combinatorial game theory. We define simple and concise mathematical models for the game, including both cooperative and uncooperative versions, and analyze their computational complexity. In particular, we prove that even a single-player version of UNO is NP-complete, although some restricted cases are in P. Surprisingly, we show that the uncooperative two-player version is also in P.

Submitted 2 December, 2013; v1 submitted 15 March, 2010; originally announced March 2010.

Comments: 13 body pages, 2 appendix pages, 1 table, 7 figures

ACM Class: G.2; F.1

arXiv:0909.0941 [pdf, ps, other]

A Randomized Rounding Algorithm for the Asymmetric Traveling Salesman Problem

Authors: Michel X. Goemans, Nicholas J. A. Harvey, Kamal Jain, Mohit Singh

Abstract: We present an algorithm for the asymmetric traveling salesman problem on instances which satisfy the triangle inequality. Like several existing algorithms, it achieves approximation ratio O(log n). Unlike previous algorithms, it uses randomized rounding.

Submitted 4 September, 2009; originally announced September 2009.

arXiv:0804.4138 [pdf, ps, other]

Sketching and Streaming Entropy via Approximation Theory

Authors: Nicholas J. A. Harvey, Jelani Nelson, Krzysztof Onak

Abstract: We conclude a sequence of work by giving near-optimal sketching and streaming algorithms for estimating Shannon entropy in the most general streaming model, with arbitrary insertions and deletions. This improves on prior results that obtain suboptimal space bounds in the general model, and near-optimal bounds in the insertion-only model without sketching. Our high-level approach is simple: we give algorithms to estimate Renyi and Tsallis entropy, and use them to extrapolate an estimate of Shannon entropy. The accuracy of our estimates is proven using approximation theory arguments and extremal properties of Chebyshev polynomials, a technique which may be useful for other problems. Our work also yields the best-known and near-optimal additive approximations for entropy, and hence also for conditional entropy and mutual information.

Submitted 25 April, 2008; originally announced April 2008.
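
The high-level approach in the abstract above (estimate Tsallis entropy, then extrapolate to Shannon entropy) rests on the fact that the Tsallis entropy $T_α(p) = (1 - \sum_i p_i^α)/(α - 1)$ tends to the Shannon entropy as $α \to 1$. The sketch below illustrates only that limit on an explicitly known distribution, using exact values and a two-point linear extrapolation. It is not the paper's streaming algorithm or its Chebyshev-based interpolation; the function names, the sample distribution `p`, and the evaluation points `1.05` and `1.10` are assumptions chosen for illustration.

```python
import math


def shannon_entropy(p):
    # Shannon entropy in nats.
    return -sum(q * math.log(q) for q in p if q > 0)


def tsallis_entropy(p, alpha):
    # Tsallis entropy of order alpha (alpha != 1).
    return (1.0 - sum(q ** alpha for q in p if q > 0)) / (alpha - 1.0)


def extrapolate_shannon(p, a1=1.05, a2=1.10):
    # Fit a line through (a1, T_a1) and (a2, T_a2), then evaluate it at alpha = 1.
    t1 = tsallis_entropy(p, a1)
    t2 = tsallis_entropy(p, a2)
    slope = (t2 - t1) / (a2 - a1)
    return t1 + slope * (1.0 - a1)


p = [0.5, 0.25, 0.125, 0.125]  # hypothetical example distribution
print("Shannon entropy:       ", shannon_entropy(p))
print("Tsallis at alpha=1.001:", tsallis_entropy(p, 1.001))
print("Linear extrapolation:  ", extrapolate_shannon(p))
```

In a streaming setting the Tsallis values would themselves be approximations, so evaluation points cannot be pushed arbitrarily close to 1 without amplifying estimation error; the abstract attributes the accuracy guarantee to approximation theory arguments rather than to a naive extrapolation like this one.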

arXiv:cs/0601026 [pdf, ps, other]

Algebraic Structures and Algorithms for Matching and Matroid Problems (Preliminary Version)

Authors: Nicholas J. A. Harvey

Abstract: Basic path-matchings, introduced by Cunningham and Geelen (FOCS 1996), are a common generalization of matroid intersection and non-bipartite matching. The main results of this paper are a new algebraic characterization of basic path-matching problems and an algorithm for constructing basic path-matchings in O(n^w) time, where n is the number of vertices and w is the exponent for matrix multiplication. Our algorithms are randomized, and our approach assumes that the given matroids are linear and can be represented over the same field. Our main results have interesting consequences for several special cases of path-matching problems. For matroid intersection, we obtain an algorithm with running time O(nr^(w-1))=O(nr^1.38), where the matroids have n elements and rank r. This improves the long-standing bound of O(nr^1.62) due to Gabow and Xu (FOCS 1989). Also, we obtain a simple, purely algebraic algorithm for non-bipartite matching with running time O(n^w). This resolves the central open problem of Mucha and Sankowski (FOCS 2004).

Submitted 9 January, 2006; originally announced January 2006.

Showing 1–29 of 29 results for author: Harvey, J