
Showing 1–50 of 54 results for author: Graham, M

Searching in archive cs.
  1. arXiv:2405.10442  [pdf, ps, other]

    physics.flu-dyn cs.LG

    Data-driven low-dimensional model of a sedimenting flexible fiber

    Authors: Andrew J Fox, Michael D. Graham

    Abstract: The dynamics of flexible filaments entrained in flow, important for understanding many biological and industrial processes, are computationally expensive to model with full-physics simulations. This work describes a data-driven technique to create high-fidelity low-dimensional models of flexible fiber dynamics using machine learning; the technique is applied to sedimentation in a quiescent, viscou…

    Submitted 16 May, 2024; originally announced May 2024.

  2. arXiv:2312.10235  [pdf, other]

    cs.LG nlin.CD

    Building symmetries into data-driven manifold dynamics models for complex flows

    Authors: Carlos E. Pérez De Jesús, Alec J. Linot, Michael D. Graham

    Abstract: Symmetries in a dynamical system provide an opportunity to dramatically improve the performance of data-driven models. For fluid flows, such models are needed for tasks related to design, understanding, prediction, and control. In this work we exploit the symmetries of the Navier-Stokes equations (NSE) and use simulation data to find the manifold where the long-time dynamics live, which has many f…

    Submitted 15 December, 2023; originally announced December 2023.

  3. A 3D generative model of pathological multi-modal MR images and segmentations

    Authors: Virginia Fernandez, Walter Hugo Lopez Pinaya, Pedro Borges, Mark S. Graham, Tom Vercauteren, M. Jorge Cardoso

    Abstract: Generative modelling and synthetic data can be a surrogate for real medical imaging datasets, whose scarcity and difficulty to share can be a nuisance when delivering accurate deep learning models for healthcare applications. In recent years, there has been an increased interest in using these models for data augmentation and synthetic data sharing, using architectures such as generative adversari…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Accepted for publication at the 2023 Deep Generative Models (DGM4MICCAI) MICCAI workshop (Vancouver, Canada)

  4. arXiv:2310.06790  [pdf, other]

    cs.LG

    Enhancing Predictive Capabilities in Data-Driven Dynamical Modeling with Automatic Differentiation: Koopman and Neural ODE Approaches

    Authors: C. Ricardo Constante-Amores, Alec J. Linot, Michael D. Graham

    Abstract: Data-driven approximations of the Koopman operator are promising for predicting the time evolution of systems characterized by complex dynamics. Among these methods, the approach known as extended dynamic mode decomposition with dictionary learning (EDMD-DL) has garnered significant attention. Here we present a modification of EDMD-DL that concurrently determines both the dictionary of observables…

    Submitted 17 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  5. arXiv:2307.15208  [pdf, other]

    eess.IV cs.CV

    Generative AI for Medical Imaging: extending the MONAI Framework

    Authors: Walter H. L. Pinaya, Mark S. Graham, Eric Kerfoot, Petru-Daniel Tudosiu, Jessica Dafflon, Virginia Fernandez, Pedro Sanchez, Julia Wolleb, Pedro F. da Costa, Ashay Patel, Hyungjin Chung, Can Zhao, Wei Peng, Zelong Liu, Xueyan Mei, Oeslle Lucena, Jong Chul Ye, Sotirios A. Tsaftaris, Prerna Dogra, Andrew Feng, Marc Modat, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the comp…

    Submitted 27 July, 2023; originally announced July 2023.

  6. arXiv:2307.03777  [pdf, other]

    cs.CV

    Unsupervised 3D out-of-distribution detection with latent diffusion models

    Authors: Mark S. Graham, Walter Hugo Lopez Pinaya, Paul Wright, Petru-Daniel Tudosiu, Yee H. Mah, James T. Teo, H. Rolf Jäger, David Werring, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Methods for out-of-distribution (OOD) detection that scale to 3D data are crucial components of any real-world clinical deep learning system. Classic denoising diffusion probabilistic models (DDPMs) have been recently proposed as a robust way to perform reconstruction-based OOD detection on 2D datasets, but do not trivially scale to 3D data. In this work, we propose to use Latent Diffusion Models…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: Accepted at MICCAI 2023

  7. arXiv:2305.01090  [pdf, ps, other]

    cs.LG nlin.CD

    Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems

    Authors: Kevin Zeng, Carlos E. Pérez De Jesús, Andrew J. Fox, Michael D. Graham

    Abstract: While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal…

    Submitted 6 December, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

  8. arXiv:2305.01071  [pdf, other]

    cs.DL

    Right HTML, Wrong JSON: Challenges in Replaying Archived Webpages Built with Client-Side Rendering

    Authors: Michele C. Weigle, Michael L. Nelson, Sawood Alam, Mark Graham

    Abstract: Many web sites are transitioning how they construct their pages. The conventional model is where the content is embedded server-side in the HTML and returned to the client in an HTTP response. Increasingly, sites are moving to a model where the initial HTTP response contains only an HTML skeleton plus JavaScript that makes API calls to a variety of servers for the content (typically in JSON format…

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: 20 pages, preprint version of paper accepted at the 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL)

  9. arXiv:2301.12098  [pdf, other]

    physics.flu-dyn cs.LG

    Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning

    Authors: Alec J. Linot, Kevin Zeng, Michael D. Graham

    Abstract: The high dimensionality and complex dynamics of turbulent flows remain an obstacle to the discovery and implementation of control strategies. Deep reinforcement learning (RL) is a promising avenue for overcoming these obstacles, but requires a training phase in which the RL agent iteratively interacts with the flow environment to learn a control policy, which can be prohibitively expensive when th…

    Submitted 28 January, 2023; originally announced January 2023.

  10. arXiv:2301.04638  [pdf, other]

    physics.flu-dyn cs.LG

    Dynamics of a data-driven low-dimensional model of turbulent minimal Couette flow

    Authors: Alec J. Linot, Michael D. Graham

    Abstract: Because the Navier-Stokes equations are dissipative, the long-time dynamics of a flow in state space are expected to collapse onto a manifold whose dimension may be much lower than the dimension required for a resolved simulation. On this manifold, the state of the system can be exactly described in a coordinate system parameterizing the manifold. Describing the system in this low-dimensional coor…

    Submitted 11 January, 2023; originally announced January 2023.

  11. arXiv:2212.01493  [pdf]

    astro-ph.IM cs.AI cs.LG

    Applications of AI in Astronomy

    Authors: S. G. Djorgovski, A. A. Mahabal, M. J. Graham, K. Polsterer, A. Krone-Martins

    Abstract: We provide a brief, and inevitably incomplete overview of the use of Machine Learning (ML) and other AI methods in astronomy, astrophysics, and cosmology. Astronomy entered the big data era with the first digital sky surveys in the early 1990s and the resulting Terascale data sets, which required automating of many data processing and analysis tasks, for example the star-galaxy separation, with bi…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 12 pages, 1 figure, an invited review chapter, to appear in: Artificial Intelligence for Science, eds. A. Choudhary, G. Fox and T. Hey, Singapore: World Scientific, in press (2023)

  12. Deep learning delay coordinate dynamics for chaotic attractors from partial observable data

    Authors: Charles D. Young, Michael D. Graham

    Abstract: A common problem in time series analysis is to predict dynamics with only scalar or partial observations of the underlying dynamical system. For data on a smooth compact manifold, Takens theorem proves a time delayed embedding of the partial state is diffeomorphic to the attractor, although for chaotic and highly nonlinear systems learning these delay coordinate mappings is challenging. We utilize…

    Submitted 20 November, 2022; originally announced November 2022.

  13. arXiv:2211.07740  [pdf, other]

    cs.LG cs.CV

    Denoising diffusion models for out-of-distribution detection

    Authors: Mark S. Graham, Walter H. L. Pinaya, Petru-Daniel Tudosiu, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Out-of-distribution detection is crucial to the safe deployment of machine learning systems. Currently, unsupervised out-of-distribution detection is dominated by generative-based approaches that make use of estimates of the likelihood or other measurements from a generative model. Reconstruction-based methods offer an alternative approach, in which a measure of reconstruction error is used to det…

    Submitted 20 April, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

  14. Data-driven low-dimensional dynamic model of Kolmogorov flow

    Authors: Carlos E. Pérez De Jesús, Michael D. Graham

    Abstract: Reduced order models (ROMs) that capture flow dynamics are of interest for decreasing computational costs for simulation as well as for model-based control approaches. This work presents a data-driven framework for minimal-dimensional models that effectively capture the dynamics and properties of the flow. We apply this to Kolmogorov flow in a regime consisting of chaotic and intermittent behavior…

    Submitted 1 August, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

    Journal ref: Phys. Rev. Fluids 8, 044402 (2023)

  15. Can segmentation models be trained with fully synthetically generated data?

    Authors: Virginia Fernandez, Walter Hugo Lopez Pinaya, Pedro Borges, Petru-Daniel Tudosiu, Mark S Graham, Tom Vercauteren, M Jorge Cardoso

    Abstract: In order to achieve good performance and generalisability, medical image segmentation models should be trained on sizeable datasets with sufficient variability. Due to ethics and governance restrictions, and the costs associated with labelling data, scientific development is often stifled, with models trained and tested on limited data. Data augmentation is often used to artificially increase the…

    Submitted 17 September, 2022; originally announced September 2022.

    Comments: 12 pages, 2 (+2 App.) figures, 3 tables. Accepted at Simulation and Synthesis in Medical Imaging workshop (MICCAI 2022)

  16. arXiv:2209.03177  [pdf, other]

    eess.IV cs.CV cs.LG

    Morphology-preserving Autoregressive 3D Generative Modelling of the Brain

    Authors: Petru-Daniel Tudosiu, Walter Hugo Lopez Pinaya, Mark S. Graham, Pedro Borges, Virginia Fernandez, Dai Yang, Jeremy Appleyard, Guido Novati, Disha Mehra, Mike Vella, Parashkev Nachev, Sebastien Ourselin, Jorge Cardoso

    Abstract: Human anatomy, morphology, and associated diseases can be studied using medical imaging data. However, access to medical imaging data is restricted by governance and privacy concerns, data ownership, and the cost of acquisition, thus limiting our ability to understand the human body. A possible solution to this issue is the creation of a model able to learn and then generate synthetic images of th…

    Submitted 7 September, 2022; originally announced September 2022.

    Comments: 13 pages, 3 figures, 2 tables, accepted at SASHIMI MICCAI 2022

    MSC Class: 68T99 (Primary) 92C55 (Secondary)

    ACM Class: I.2.1; J.3

  17. arXiv:2207.09060  [pdf, other]

    physics.ed-ph cs.LG hep-ex physics.comp-ph

    Data Science and Machine Learning in Education

    Authors: Gabriele Benelli, Thomas Y. Chen, Javier Duarte, Matthew Feickert, Matthew Graham, Lindsey Gray, Dan Hackett, Phil Harris, Shih-Chieh Hsu, Gregor Kasieczka, Elham E. Khoda, Matthias Komm, Mia Liu, Mark S. Neubauer, Scarlet Norberg, Alexx Perloff, Marcel Rieger, Claire Savard, Kazuhiro Terao, Savannah Thais, Avik Roy, Jean-Roch Vlimant, Grigorios Chachamis

    Abstract: The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data sets, and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data has inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit gr…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: Contribution to Snowmass 2021

  18. arXiv:2206.03461  [pdf, other]

    cs.CV eess.IV q-bio.QM

    Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models

    Authors: Walter H. L. Pinaya, Mark S. Graham, Robert Gray, Pedro F Da Costa, Petru-Daniel Tudosiu, Paul Wright, Yee H. Mah, Andrew D. MacKinnon, James T. Teo, Rolf Jager, David Werring, Geraint Rees, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the necessity for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance for anomaly detection in medical imaging. Nonetheless, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the ac…

    Submitted 7 June, 2022; originally announced June 2022.

  19. arXiv:2205.10650  [pdf, other]

    cs.CV cs.LG

    Transformer-based out-of-distribution detection for clinically safe segmentation

    Authors: Mark S Graham, Petru-Daniel Tudosiu, Paul Wright, Walter Hugo Lopez Pinaya, U Jean-Marie, Yee Mah, James Teo, Rolf H Jäger, David Werring, Parashkev Nachev, Sebastien Ourselin, M Jorge Cardoso

    Abstract: In a clinical setting it is essential that deployed image processing systems are robust to the full range of inputs they might encounter and, in particular, do not make confidently wrong predictions. The most popular approach to safe processing is to train networks that can provide a measure of their uncertainty, but these tend to fail for inputs that are far outside the training data distribution…

    Submitted 17 May, 2023; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: Accepted at MIDL 2022 (Oral)

  20. arXiv:2205.03204  [pdf, other]

    cs.CG

    Bounded-degree plane geometric spanners in practice

    Authors: Frederick Anderson, Anirban Ghosh, Matthew Graham, Lucas Mougeot, David Wisnosky

    Abstract: The construction of bounded-degree plane geometric spanners has been a focus of interest since 2002 when Bose, Gudmundsson, and Smid proposed the first algorithm to construct such spanners. To date, eleven algorithms have been designed with various trade-offs in degree and stretch factor. We have implemented these sophisticated algorithms in C++ using the CGAL library and experimented with them us…

    Submitted 6 May, 2022; originally announced May 2022.

  21. arXiv:2205.01716  [pdf, other]

    cs.CG

    Experiments with Unit Disk Cover Algorithms for Covering Massive Pointsets

    Authors: Rachel Friederich, Matthew Graham, Anirban Ghosh, Brian Hicks, Ronald Shevchenko

    Abstract: Given a set of $n$ points in the plane, the Unit Disk Cover (UDC) problem asks to compute the minimum number of unit disks required to cover the points, along with a placement of the disks. The problem is NP-hard and several approximation algorithms have been designed over the last three decades. In this paper, we have engineered and experimentally compared practical performances of some of these…

    Submitted 3 May, 2022; originally announced May 2022.

  22. Data-driven control of spatiotemporal chaos with reduced-order neural ODE-based models and reinforcement learning

    Authors: Kevin Zeng, Alec J. Linot, Michael D. Graham

    Abstract: Deep reinforcement learning (RL) is a data-driven method capable of discovering complex control strategies for high-dimensional systems, making it promising for flow control applications. In particular, the present work is motivated by the goal of reducing energy dissipation in turbulent flows, and the example considered is the spatiotemporally chaotic dynamics of the Kuramoto-Sivashinsky equation…

    Submitted 1 May, 2022; originally announced May 2022.

  23. Stabilized Neural Ordinary Differential Equations for Long-Time Forecasting of Dynamical Systems

    Authors: Alec J. Linot, Joshua W. Burby, Qi Tang, Prasanna Balaprakash, Michael D. Graham, Romit Maulik

    Abstract: In data-driven modeling of spatiotemporal phenomena careful consideration often needs to be made in capturing the dynamics of the high wavenumbers. This problem becomes especially challenging when the system of interest exhibits shocks or chaotic dynamics. We present a data-driven modeling method that accurately captures shocks and chaotic dynamics by proposing a novel architecture, stabilized neu…

    Submitted 3 October, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  24. arXiv:2109.08445  [pdf, other]

    cs.HC cs.CR

    Developing Visualisations to Enhance an Insider Threat Product: A Case Study

    Authors: Martin Graham, Robert Kukla, Oleksii Mandrychenko, Darren Hart, Jessie Kennedy

    Abstract: This paper describes the process of developing data visualisations to enhance a commercial software platform for combating insider threat, whose existing UI, while perfectly functional, was limited in its ability to allow analysts to easily spot the patterns and outliers that visualisation naturally reveals. We describe the design and development process, proceeding from initial tasks/requirements…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: VizSec 2021

    ACM Class: H.5.2

  25. arXiv:2109.00060  [pdf, other]

    cs.LG nlin.CD

    Data-Driven Reduced-Order Modeling of Spatiotemporal Chaos with Neural Ordinary Differential Equations

    Authors: Alec J. Linot, Michael D. Graham

    Abstract: Dissipative partial differential equations that exhibit chaotic dynamics tend to evolve to attractors that exist on finite-dimensional manifolds. We present a data-driven reduced order modeling method that capitalizes on this fact by finding the coordinates of this manifold and finding an ordinary differential equation (ODE) describing the dynamics in this coordinate system. The manifold coordinat…

    Submitted 31 August, 2021; originally announced September 2021.

  26. Data-driven discovery of intrinsic dynamics

    Authors: Daniel Floryan, Michael D. Graham

    Abstract: Dynamical models underpin our ability to understand and predict the behavior of natural systems. Whether dynamical models are developed from first-principles derivations or from observational data, they are predicated on our choice of state variables. The choice of state variables is driven by convenience and intuition, and in the data-driven case the observed variables are often chosen to be the…

    Submitted 14 June, 2022; v1 submitted 12 August, 2021; originally announced August 2021.

    Comments: 27 pages + 15 pages Supplementary Information

    Journal ref: Nature Machine Intelligence, Vol. 4, Issue 12, pp. 1113-1120 (2022)

  27. arXiv:2107.10695  [pdf, other]

    cs.IT math.PR

    Low latency allcast over broadcast erasure channels

    Authors: Mark A. Graham, Ayalvadi J. Ganesh, Robert J. Piechocki

    Abstract: Consider n nodes communicating over an unreliable broadcast channel. Each node has a single packet that needs to be communicated to all other nodes. Time is slotted, and a time slot is long enough for each node to broadcast one packet. Each broadcast reaches a random subset of nodes. The objective is to minimise the time until all nodes have received all packets. We study two schemes, (i) random r…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: Submitted to IEEE Transactions on Information Theory

    MSC Class: 94A05 94A15

  28. Symmetry reduction for deep reinforcement learning active control of chaotic spatiotemporal dynamics

    Authors: Kevin Zeng, Michael D. Graham

    Abstract: Deep reinforcement learning (RL) is a data-driven, model-free method capable of discovering complex control strategies for macroscopic objectives in high-dimensional systems, making its application towards flow control promising. Many systems of flow control interest possess symmetries that, when neglected, can significantly inhibit the learning and performance of a naive deep RL approach. Using a…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: Submitted to Physical Review E

    Journal ref: Phys. Rev. E 104, 014210 (2021)

  29. arXiv:2103.08201  [pdf, other]

    cs.CV cs.AI

    Geometric Change Detection in Digital Twins using 3D Machine Learning

    Authors: Tiril Sundby, Julia Maria Graham, Adil Rasheed, Mandar Tabib, Omer San

    Abstract: Digital twins are meant to bridge the gap between real-world physical systems and virtual representations. Both stand-alone and descriptive digital twins incorporate 3D geometric models, which are the physical representations of objects in the digital replica. Digital twin applications are required to rapidly update internal parameters with the evolution of their physical counterpart. Due to an es…

    Submitted 15 March, 2021; originally announced March 2021.

  30. arXiv:2102.13352  [pdf, other]

    astro-ph.IM astro-ph.EP cs.LG

    Tails: Chasing Comets with the Zwicky Transient Facility and Deep Learning

    Authors: Dmitry A. Duev, Bryce T. Bolin, Matthew J. Graham, Michael S. P. Kelley, Ashish Mahabal, Eric C. Bellm, Michael W. Coughlin, Richard Dekany, George Helou, Shrinivas R. Kulkarni, Frank J. Masci, Thomas A. Prince, Reed Riddle, Maayane T. Soumagnac, Stéfan J. van der Walt

    Abstract: We present Tails, an open-source deep-learning framework for the identification and localization of comets in the image data of the Zwicky Transient Facility (ZTF), a robotic optical time-domain survey currently in operation at the Palomar Observatory in California, USA. Tails employs a custom EfficientDet-based architecture and is capable of finding comets in single images in near real time, rath…

    Submitted 26 February, 2021; originally announced February 2021.

  31. arXiv:2011.00867  [pdf, other]

    cs.DB cs.IR

    Accessible Data Curation and Analytics for International-Scale Citizen Science Datasets

    Authors: Benjamin Murray, Eric Kerfoot, Mark S. Graham, Carole H. Sudre, Erika Molteni, Liane S. Canas, Michela Antonelli, Kerstin Klaser, Alessia Visconti, Andrew T. Chan, Paul W. Franks, Richard Davies, Jonathan Wolf, Tim Spector, Claire J. Steves, Marc Modat, Sebastien Ourselin

    Abstract: The Covid Symptom Study, a smartphone-based surveillance study on COVID-19 symptoms in the population, is an exemplar of big data citizen science. Over 4.7 million participants and 189 million unique assessments have been logged since its introduction in March 2020. The success of the Covid Symptom Study creates technical challenges around effective data curation for two reasons. Firstly, the scal…

    Submitted 17 February, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    ACM Class: D.m; E.2; H.3.3; I.7

  32. arXiv:2010.01926  [pdf, other]

    eess.IV cs.CV

    Test-time Unsupervised Domain Adaptation

    Authors: Thomas Varsavsky, Mauricio Orbes-Arteaga, Carole H. Sudre, Mark S. Graham, Parashkev Nachev, M. Jorge Cardoso

    Abstract: Convolutional neural networks trained on publicly available medical imaging datasets (source domain) rarely generalise to different scanners or acquisition protocols (target domain). This motivates the active field of domain adaptation. While some approaches to the problem require labeled data from the target domain, others adopt an unsupervised approach to domain adaptation (UDA). Evaluating UDA…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted at MICCAI 2020

  33. arXiv:2009.07573  [pdf, other]

    cs.CV

    Hierarchical brain parcellation with uncertainty

    Authors: Mark S. Graham, Carole H. Sudre, Thomas Varsavsky, Petru-Daniel Tudosiu, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Many atlases used for brain parcellation are hierarchically organised, progressively dividing the brain into smaller sub-regions. However, state-of-the-art parcellation methods tend to ignore this structure and treat labels as if they are 'flat'. We introduce a hierarchically-aware brain parcellation method that works by predicting the decisions at each branch in the label tree. We further show ho…

    Submitted 16 September, 2020; originally announced September 2020.

    Comments: To be published in the MICCAI 2020 workshop: Uncertainty for Safe Utilization of Machine Learning in Medical Imaging

  34. arXiv:2002.05692  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Neuromorphologicaly-preserving Volumetric data encoding using VQ-VAE

    Authors: Petru-Daniel Tudosiu, Thomas Varsavsky, Richard Shaw, Mark Graham, Parashkev Nachev, Sebastien Ourselin, Carole H. Sudre, M. Jorge Cardoso

    Abstract: The increasing efficiency and compactness of deep learning architectures, together with hardware improvements, have enabled the complex and high-dimensional modelling of medical volumetric data at higher resolutions. Recently, Vector-Quantised Variational Autoencoders (VQ-VAE) have been proposed as an efficient generative unsupervised learning approach that can encode images to a small percentage…

    Submitted 13 February, 2020; originally announced February 2020.

  35. arXiv:2002.00854  [pdf]

    cs.SI cs.CY physics.soc-ph

    Measuring relative opinion from location-based social media: A case study of the 2016 U.S. presidential election

    Authors: Zhaoya Gong, Tengteng Cai, Jean-Claude Thill, Scott Hale, Mark Graham

    Abstract: Social media has become an emerging alternative to opinion polls for public opinion collection, while it is still posing many challenges as a passive data source, such as structurelessness, quantifiability, and representativeness. Social media data with geotags provide new opportunities to unveil the geographic locations of users expressing their opinions. This paper aims to answer two questions:…

    Submitted 20 April, 2020; v1 submitted 3 February, 2020; originally announced February 2020.

    Journal ref: PLoS ONE 15(5): e0233660 (2020)

  36. arXiv:2001.04263  [pdf, other]

    cs.LG physics.flu-dyn

    Deep learning to discover and predict dynamics on an inertial manifold

    Authors: Alec J. Linot, Michael D. Graham

    Abstract: A data-driven framework is developed to represent chaotic dynamics on an inertial manifold (IM), and applied to solutions of the Kuramoto-Sivashinsky equation. A hybrid method combining linear and nonlinear (neural-network) dimension reduction transforms between coordinates in the full state space and on the IM. Additional neural networks predict time-evolution on the IM. The formalism accounts fo…

    Submitted 21 May, 2020; v1 submitted 20 December, 2019; originally announced January 2020.

    Comments: Accepted in Physical Review E

    Journal ref: Phys. Rev. E 101, 062209 (2020)

  37. arXiv:1911.11779  [pdf, other]

    gr-qc astro-ph.HE astro-ph.IM cs.LG

    Enabling real-time multi-messenger astrophysics discoveries with deep learning

    Authors: E. A. Huerta, Gabrielle Allen, Igor Andreoni, Javier M. Antelis, Etienne Bachelet, Bruce Berriman, Federica Bianco, Rahul Biswas, Matias Carrasco, Kyle Chard, Minsik Cho, Philip S. Cowperthwaite, Zachariah B. Etienne, Maya Fishbach, Francisco Förster, Daniel George, Tom Gibbs, Matthew Graham, William Gropp, Robert Gruendl, Anushri Gupta, Roland Haas, Sarah Habib, Elise Jennings, Margaret W. G. Johnson, et al. (35 additional authors not shown)

    Abstract: Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravit…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: Invited Expert Recommendation for Nature Reviews Physics. The art work produced by E. A. Huerta and Shawn Rosofsky for this article was used by Carl Conway to design the cover of the October 2019 issue of Nature Reviews Physics

    Journal ref: Nature Reviews Physics volume 1, pages 600-608 (2019)

  38. arXiv:1909.06429  [pdf, other]

    stat.ML cs.IR cs.LG

    Recommendation or Discrimination?: Quantifying Distribution Parity in Information Retrieval Systems

    Authors: Rinat Khaziev, Bryce Casavant, Pearce Washabaugh, Amy A. Winecoff, Matthew Graham

    Abstract: Information retrieval (IR) systems often leverage query data to suggest relevant items to users. This introduces the possibility of unfairness if the query (i.e., input) and the resulting recommendations unintentionally correlate with latent factors that are protected variables (e.g., race, gender, and age). For instance, a visual search system for fashion recommendations may pick up on features o…

    Submitted 13 September, 2019; originally announced September 2019.

  39. arXiv:1907.05164  [pdf]

    eess.IV cs.CV cs.LG

    Disease classification of macular Optical Coherence Tomography scans using deep learning software: validation on independent, multi-centre data

    Authors: Kanwal K. Bhatia, Mark S. Graham, Louise Terry, Ashley Wood, Paris Tranos, Sameer Trikha, Nicolas Jaccard

    Abstract: Purpose: To evaluate Pegasus-OCT, a clinical decision support software for the identification of features of retinal disease from macula OCT scans, across heterogenous populations involving varying patient demographics, device manufacturers, acquisition sites and operators. Methods: 5,588 normal and anomalous macular OCT volumes (162,721 B-scans), acquired at independent centres in five countrie…

    Submitted 11 July, 2019; originally announced July 2019.

  40. arXiv:1905.06457  [pdf, other]

    cs.SI cs.CV q-bio.MN

    An interdisciplinary survey of network similarity methods

    Authors: Emily Evans, Marissa Graham

    Abstract: Comparative graph and network analysis play an important role in both systems biology and pattern recognition, but existing surveys on the topic have historically ignored or underserved one or the other of these fields. We present an integrative introduction to the key objectives and methods of graph and network comparison in each field, with the intent of remaining accessible to relative novices…

    Submitted 15 May, 2019; originally announced May 2019.

  41. arXiv:1902.00522  [pdf, ps, other]

    astro-ph.IM astro-ph.HE cs.LG gr-qc

    Deep Learning for Multi-Messenger Astrophysics: A Gateway for Discovery in the Big Data Era

    Authors: Gabrielle Allen, Igor Andreoni, Etienne Bachelet, G. Bruce Berriman, Federica B. Bianco, Rahul Biswas, Matias Carrasco Kind, Kyle Chard, Minsik Cho, Philip S. Cowperthwaite, Zachariah B. Etienne, Daniel George, Tom Gibbs, Matthew Graham, William Gropp, Anushri Gupta, Roland Haas, E. A. Huerta, Elise Jennings, Daniel S. Katz, Asad Khan, Volodymyr Kindratenko, William T. C. Kramer, Xin Liu, Ashish Mahabal, et al. (23 additional authors not shown)

    Abstract: This report provides an overview of recent work that harnesses the Big Data Revolution and Large Scale Computing to address grand computational challenges in Multi-Messenger Astrophysics, with a particular emphasis on real-time discovery campaigns. Acknowledging the transdisciplinary nature of Multi-Messenger Astrophysics, this document has been prepared by members of the physics, astronomy, compu…

    Submitted 1 February, 2019; originally announced February 2019.

    Comments: 15 pages, no figures. White paper based on the "Deep Learning for Multi-Messenger Astrophysics: Real-time Discovery at Scale" workshop, hosted at NCSA, October 17-19, 2018 http://www.ncsa.illinois.edu/Conferences/DeepLearningLSST/

  42. Platform Criminalism: The 'Last-Mile' Geography of the Darknet Market Supply Chain

    Authors: Martin Dittus, Joss Wright, Mark Graham

    Abstract: Does recent growth of darknet markets signify a slow reorganisation of the illicit drug trade? Where are darknet markets situated in the global drug supply chain? In principle, these platforms allow producers to sell directly to end users, bypassing traditional trafficking routes. And yet, there is evidence that many offerings originate from a small number of highly active consumer countries, rath…

    Submitted 11 March, 2018; v1 submitted 28 December, 2017; originally announced December 2017.

    ACM Class: K.4.1, K.4.4

  43. Deep-Learnt Classification of Light Curves

    Authors: Ashish Mahabal, Kshiteej Sheth, Fabian Gieseke, Akshay Pai, S. George Djorgovski, Andrew Drake, Matthew Graham, the CSS/CRTS/PTF Collaboration

    Abstract: Astronomy light curves are sparse, gappy, and heteroscedastic. As a result standard time series methods regularly used for financial and similar datasets are of little help and astronomers are usually left to their own instruments and techniques to classify light curves. A common approach is to derive statistical features from the time series and to use machine learning methods, generally supervis…

    Submitted 19 September, 2017; originally announced September 2017.

    Comments: 8 pages, 9 figures, 6 tables, 2 listings. Accepted to 2017 IEEE Symposium Series on Computational Intelligence (SSCI)

    Journal ref: 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu, HI, USA, 2017, p2757

  44. arXiv:1605.02688  [pdf, other]

    cs.SC cs.LG cs.MS

    Theano: A Python framework for fast computation of mathematical expressions

    Authors: The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, et al. (88 additional authors not shown)

    Abstract: Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, mu…

    Submitted 9 May, 2016; originally announced May 2016.

    Comments: 19 pages, 5 figures

  45. Software search is not a science, even among scientists: A survey of how scientists and engineers find software

    Authors: Michael Hucka, Matthew J. Graham

    Abstract: Improved software discovery is a prerequisite for greater software reuse: after all, if someone cannot find software for a particular task, they cannot reuse it. Understanding people's approaches and preferences when they look for software could help improve facilities for software discovery. We surveyed people working in several scientific and engineering fields to better understand their approac…

    Submitted 28 May, 2018; v1 submitted 7 May, 2016; originally announced May 2016.

    Journal ref: Journal of Systems and Software, 141:171-191, 2018

  46. Real-Time Data Mining of Massive Data Streams from Synoptic Sky Surveys

    Authors: S. G. Djorgovski, M. J. Graham, C. Donalek, A. A. Mahabal, A. J. Drake, M. Turmon, T. Fuchs

    Abstract: The nature of scientific and technological data collection is evolving rapidly: data volumes and rates grow exponentially, with increasing complexity and information content, and there has been a transition from static data sets to data streams that must be analyzed in real time. Interesting or anomalous phenomena must be quickly characterized and followed up with additional measurements via optim…

    Submitted 17 January, 2016; originally announced January 2016.

    Comments: 14 pages, an invited paper for a special issue of Future Generation Computer Systems, Elsevier Publ. (2015). This is an expanded version of a paper arXiv:1407.3502 presented at the IEEE e-Science 2014 conf., with some new content

  47. Immersive and Collaborative Data Visualization Using Virtual Reality Platforms

    Authors: Ciro Donalek, S. G. Djorgovski, Scott Davidoff, Alex Cioc, Anwell Wang, Giuseppe Longo, Jeffrey S. Norris, Jerry Zhang, Elizabeth Lawler, Stacy Yeh, Ashish Mahabal, Matthew Graham, Andrew Drake

    Abstract: Effective data visualization is a key part of the discovery process in the era of big data. It is the bridge between the quantitative content of the data and human intuition, and thus an essential component of the scientific path from data into knowledge and understanding. Visualization is also essential in the data mining process, directing the choice of the applicable algorithms, and in helping…

    Submitted 28 October, 2014; originally announced October 2014.

    Comments: 6 pages, refereed proceedings of 2014 IEEE International Conference on Big Data, page 609, ISBN 978-1-4799-5665-4

  48. arXiv:1407.3692  [pdf]

    cs.HC

    Helium: Visualization of Large Scale Plant Pedigrees

    Authors: Paul D. Shaw, Martin Graham, Jessie Kennedy, Iain Milne, David F. Marshall

    Abstract: Background: Plant breeders are utilising an increasingly diverse range of data types in order to identify lines that have desirable characteristics which are suitable to be taken forward in plant breeding programmes. There are a number of key morphological and physiological traits such as disease resistance and yield that are required to be maintained, and improved upon if a commercial variety is…

    Submitted 11 July, 2014; originally announced July 2014.

    Comments: BioVis 2014 conference

  49. Feature Selection Strategies for Classifying High Dimensional Astronomical Data Sets

    Authors: Ciro Donalek, Arun Kumar A., S. G. Djorgovski, Ashish A. Mahabal, Matthew J. Graham, Thomas J. Fuchs, Michael J. Turmon, N. Sajeeth Philip, Michael Ting-Chang Yang, Giuseppe Longo

    Abstract: The amount of collected data in many scientific fields is increasing, all of them requiring a common task: extract knowledge from massive, multi parametric data sets, as rapidly and efficiently possible. This is especially true in astronomy where synoptic sky surveys are enabling new research frontiers in the time domain astronomy and posing several new object classification challenges in multi di…

    Submitted 7 October, 2013; originally announced October 2013.

    Comments: 7 pages, to appear in refereed proceedings of Scalable Machine Learning: Theory and Applications, IEEE BigData 2013

  50. Where in the World are You? Geolocation and Language Identification in Twitter

    Authors: Mark Graham, Scott A. Hale, Devin Gaffney

    Abstract: The movements of ideas and content between locations and languages are unquestionably crucial concerns to researchers of the information age, and Twitter has emerged as a central, global platform on which hundreds of millions of people share knowledge and information. A variety of research has attempted to harvest locational and linguistic metadata from tweets in order to understand important ques…

    Submitted 3 August, 2013; originally announced August 2013.