-
Universal Functional Regression with Neural Operator Flows
Authors:
Yaozhong Shi,
Angela F. Gao,
Zachary E. Ross,
Kamyar Azizzadenesheli
Abstract:
Regression on function spaces is typically limited to models with Gaussian process priors. We introduce the notion of universal functional regression, in which we aim to learn a prior distribution over non-Gaussian function spaces that remains mathematically tractable for functional regression. To do this, we develop Neural Operator Flows (OpFlow), an infinite-dimensional extension of normalizing flows. OpFlow is an invertible operator that maps the (potentially unknown) data function space into a Gaussian process, allowing for exact likelihood estimation of functional point evaluations. OpFlow enables robust and accurate uncertainty quantification via drawing posterior samples of the Gaussian process and subsequently mapping them into the data function space. We empirically study the performance of OpFlow on regression and generation tasks with data generated from Gaussian processes with known posterior forms and non-Gaussian processes, as well as real-world earthquake seismograms with an unknown closed-form distribution.
Submitted 3 April, 2024;
originally announced April 2024.
-
Insights on the dip of fault zones in Southern California from modeling of seismicity with anisotropic point processes
Authors:
Zachary E. Ross
Abstract:
Accurate models of fault zone geometry are important for scientific and hazard applications. While seismicity can provide high-resolution point measurements of fault geometry, extrapolating these measurements to volumes may involve making strong assumptions. This is particularly problematic in distributed fault zones, which are commonly observed in immature faulting regions. In this study, we focus on characterizing the dip of fault zones in Southern California with the goal of improving fault models. We introduce a novel technique from spatial point process theory to quantify the orientation of persistent surficial features in seismicity, even when embedded in wide shear zones. The technique makes relatively mild assumptions about fault geometry and is formulated with the goal of determining the dip of a fault zone at depth. The method is applied to 11 prominent seismicity regions in Southern California. Overall, the results compare favorably with the geometry models provided by the SCEC Community Fault Model and other focused regional studies. More specifically, we find evidence that the Southern San Andreas and San Jacinto fault zones are both northeast dipping at seismogenic depths at the length scales of 1.0-4.0 km. In addition, we find more limited evidence for some depth-dependent variations in dip that suggest a listric geometry. The developed technique can provide an independent source of information from seismicity to augment existing fault geometry models.
Submitted 27 March, 2024;
originally announced March 2024.
-
Deep Neural Helmholtz Operators for 3D Elastic Wave Propagation and Inversion
Authors:
Caifeng Zou,
Kamyar Azizzadenesheli,
Zachary E. Ross,
Robert W. Clayton
Abstract:
Numerical simulations of seismic wave propagation in heterogeneous 3D media are central to investigating subsurface structures and understanding earthquake processes, yet are computationally expensive for large problems. This is particularly problematic for full waveform inversion, which typically involves numerous runs of the forward process. In machine learning there has been considerable recent work in the area of operator learning, with a new class of models called neural operators allowing for data-driven solutions to partial differential equations. Recent works in seismology have shown that when neural operators are adequately trained, they can significantly shorten the compute time for wave propagation. However, the memory required for the 3D time domain equations may be prohibitive. In this study, we show that these limitations can be overcome by solving the wave equations in the frequency domain, also known as the Helmholtz equations, since the solutions for a set of frequencies can be determined in parallel. The 3D Helmholtz neural operator is 40 times more memory-efficient than an equivalent time-domain version. We employ a U-shaped neural operator for 2D and 3D elastic wave modeling, achieving two orders of magnitude acceleration compared to a baseline spectral element method. The neural operator accurately generalizes to variable velocity structures and can be evaluated on denser input meshes than used in the training simulations. We also show that when solving for wavefields strictly on the surface, the accuracy can be significantly improved via a graph neural operator layer. In leveraging automatic differentiation, the proposed method can serve as an alternative to the adjoint-state approach for 3D full-waveform inversion, reducing the computation time by a factor of 350.
Submitted 16 November, 2023;
originally announced November 2023.
-
Broadband Ground Motion Synthesis via Generative Adversarial Neural Operators: Development and Validation
Authors:
Yaozhong Shi,
Grigorios Lavrentiadis,
Domniki Asimaki,
Zachary E. Ross,
Kamyar Azizzadenesheli
Abstract:
We present a data-driven framework for ground-motion synthesis that generates three-component acceleration time histories conditioned on moment magnitude, rupture distance, time-average shear-wave velocity at the top 30 m ($V_{S30}$), and style of faulting. We use a Generative Adversarial Neural Operator (GANO), a resolution invariant architecture that guarantees model training independent of the data sampling frequency. We first present the conditional ground-motion synthesis algorithm (cGM-GANO) and discuss its advantages compared to previous work. We next train cGM-GANO on simulated ground motions generated by the Southern California Earthquake Center Broadband Platform (BBP) and on recorded KiK-net data and show that the model can learn the overall magnitude, distance, and $V_{S30}$ scaling of effective amplitude spectra (EAS) ordinates and pseudo-spectral accelerations (PSA). Results specifically show that cGM-GANO produces consistent median scaling with the training data for the corresponding tectonic environments over a wide range of frequencies for scenarios with sufficient data coverage. For the BBP dataset, cGM-GANO cannot learn the ground motion scaling of the stochastic frequency components; for the KiK-net dataset, the largest misfit is observed at short distances and for soft soil conditions due to the scarcity of such data. Except for these conditions, the aleatory variability of EAS and PSA is captured reasonably well. Lastly, cGM-GANO produces similar median scaling to traditional GMMs for frequencies greater than 1 Hz for both PSA and EAS but underestimates the aleatory variability of EAS. Discrepancies in the comparisons between the synthetic ground motions and GMMs are attributed to inconsistencies between the training dataset and the datasets used in GMM development. Our pilot study demonstrates GANO's potential for efficient synthesis of broad-band ground motions.
Submitted 14 February, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Phase Neural Operator for Multi-Station Picking of Seismic Arrivals
Authors:
Hongyu Sun,
Zachary E. Ross,
Weiqiang Zhu,
Kamyar Azizzadenesheli
Abstract:
Seismic wave arrival time measurements form the basis for numerous downstream applications. State-of-the-art approaches for phase picking use deep neural networks to annotate seismograms at each station independently, yet human experts annotate seismic data by examining the whole network jointly. Here, we introduce a general-purpose network-wide phase picking algorithm based on a recently developed machine learning paradigm called Neural Operator. Our model, called PhaseNO, leverages the spatio-temporal contextual information to pick phases simultaneously for any seismic network geometry. This results in superior performance over leading baseline algorithms by detecting many more earthquakes, picking more phase arrivals, while also greatly improving measurement accuracy. Following similar trends being seen across the domains of artificial intelligence, our approach provides but a glimpse of the potential gains from fully-utilizing the massive seismic datasets being collected worldwide.
Submitted 30 November, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
Seismic Arrival-time Picking on Distributed Acoustic Sensing Data using Semi-supervised Learning
Authors:
Weiqiang Zhu,
Ettore Biondi,
Jiaxuan Li,
Jiuxun Yin,
Zachary E. Ross,
Zhongwen Zhan
Abstract:
Distributed Acoustic Sensing (DAS) is an emerging technology for earthquake monitoring and subsurface imaging. The recorded seismic signals by DAS have several distinct characteristics, such as unknown coupling effects, strong anthropogenic noise, and ultra-dense spatial sampling. These aspects differ from conventional seismic data recorded by seismic networks, making it challenging to utilize DAS at present for seismic monitoring. New data analysis algorithms are needed to extract useful information from DAS data. Previous studies on conventional seismic data demonstrated that deep learning models could achieve performance close to human analysts in picking seismic phases. However, phase picking on DAS data is still a difficult problem due to the lack of manual labels. Further, the differences in mathematical structure between these two data formats, i.e., ultra-dense DAS arrays and sparse seismic networks, make model fine-tuning or transfer learning difficult to implement on DAS data. In this work, we design a new approach using semi-supervised learning to solve the phase-picking task on DAS arrays. We use a pre-trained PhaseNet model as a teacher network to generate noisy labels of P and S arrivals on DAS data and apply the Gaussian mixture model phase association (GaMMA) method to refine these noisy labels to build training datasets. We develop a new deep learning model, PhaseNet-DAS, to process the 2D spatial-temporal data of DAS arrays and train the model on DAS data. The new deep learning model achieves high picking accuracy and good earthquake detection performance. We then apply the model to process continuous data and build earthquake catalogs directly from DAS recording. Our approach using semi-supervised learning provides a way to build effective deep learning models for DAS, which have the potential to improve earthquake monitoring using large-scale fiber networks.
Submitted 14 March, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
A Deep Gaussian Process Model for Seismicity Background Rates
Authors:
Jack B. Muir,
Zachary E. Ross
Abstract:
The spatio-temporal properties of seismicity give us incisive insight into the stress state evolution and fault structures of the crust. Empirical models based on self-exciting point-processes continue to provide an important tool for analyzing seismicity, given the epistemic uncertainty associated with physical models. In particular, the epidemic-type aftershock sequence (ETAS) model acts as a reference model for studying seismicity catalogs. The traditional ETAS model uses simple parametric definitions for the background rate of triggering-independent seismicity. This reduces the effectiveness of the basic ETAS model in modelling the temporally complex seismicity patterns seen in seismic swarms that are dominated by aseismic tectonic processes such as fluid injection rather than aftershock triggering. In order to robustly capture time-varying seismicity rates, we introduce a deep Gaussian process formulation for the background rate as an extension to ETAS. Gaussian processes (GPs) are a robust non-parametric model for function spaces with covariance structure. By conditioning the lengthscale structure of a GP with another GP, we have a deep-GP: a probabilistic, hierarchical model that automatically tunes its structure to match data constraints. We show how the deep-GP-ETAS model can be efficiently sampled by making use of a Metropolis-within-Gibbs scheme, taking advantage of the branching process formulation of ETAS and a stochastic partial differential equation (SPDE) approximation for Matérn GPs. We illustrate our method using synthetic examples, and show that the deep-GP-ETAS model successfully captures multiscale temporal behavior in the background forcing rate of seismicity. We then apply the results to two real-data catalogues: the Ridgecrest, CA July 5 2019 Mw 7.1 event catalogue and the 2016-2019 Cahuilla, CA earthquake swarm.
Submitted 13 February, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Neural mixture model association of seismic phases
Authors:
Zachary E. Ross,
Weiqiang Zhu,
Kamyar Azizzadenesheli
Abstract:
Seismic phase association is the task of grouping phase arrival picks across a seismic network into subsets with common origins. Building on recent successes in this area with machine learning tools, we introduce a neural mixture model association algorithm (Neuma), which incorporates physics-informed neural networks and mixture models to address this challenging problem. Our formulation assumes explicitly that a dataset contains real phase picks from earthquakes and noise picks resulting from phase picking mistakes and fake picks. The problem statement is then to assign each observation to either an earthquake or noise. We iteratively update a set of hypocenters and magnitudes while determining the optimal class assignment for each pick. We show that by using a physics-informed Eikonal solver as the forward model, we can impose stringent quality control on surviving picks while maintaining high recall. We evaluate the performance of Neuma against several baseline algorithms on a series of challenging synthetic datasets and the 2019 Ridgecrest, California sequence. Neuma outperforms the baselines in precision and recall for each of the synthetic datasets. Furthermore, it detects 3285 more earthquakes than the best baseline on the Ridgecrest dataset (13.5%), while substantially improving the hypocenters.
Submitted 6 January, 2023;
originally announced January 2023.
-
Accelerating Time-Reversal Imaging with Neural Operators for Real-time Earthquake Locations
Authors:
Hongyu Sun,
Yan Yang,
Kamyar Azizzadenesheli,
Robert W. Clayton,
Zachary E. Ross
Abstract:
Earthquake hypocenters form the basis for a wide array of seismological analyses. Pick-based earthquake location workflows rely on the accuracy of phase pickers and may be biased when dealing with complex earthquake sequences in heterogeneous media. Time-reversal imaging of passive seismic sources with the cross-correlation imaging condition has potential for earthquake location with high accuracy and high resolution, but carries a large computational cost. Here we present an alternative deep-learning approach for earthquake location by combining the benefits of neural operators for wave propagation and time reversal imaging with multi-station waveform recordings. A U-shaped neural operator is trained to propagate seismic waves with various source time functions and thus can predict a backpropagated wavefield for each station in negligible time. These wavefields can either be stacked or correlated to locate earthquakes from the resulting source images. Compared with other waveform-based deep-learning location methods, time reversal imaging accounts for physical laws of wave propagation and is expected to achieve accurate earthquake location. We demonstrate the method with the 2D acoustic wave equation on both synthetic and field data. The results show that our method can efficiently obtain high resolution and high accuracy correlation-based time reversal imaging of earthquake sources. Moreover, our approach is adaptable to the number and geometry of seismic stations, which opens new strategies for real-time earthquake location and monitoring with dense seismic networks.
Submitted 12 October, 2022;
originally announced October 2022.
-
Rapid Seismic Waveform Modeling and Inversion with Neural Operators
Authors:
Yan Yang,
Angela F. Gao,
Kamyar Azizzadenesheli,
Robert W. Clayton,
Zachary E. Ross
Abstract:
Seismic waveform modeling is a powerful tool for determining earth structure models and unraveling earthquake rupture processes, but it is usually computationally expensive. We introduce a scheme to vastly accelerate these calculations with a recently developed machine learning paradigm called the neural operator. Once trained, these models can simulate a full wavefield at negligible cost. We use a U-shaped neural operator to learn a general solution operator to the 2D elastic wave equation from an ensemble of numerical simulations performed with random velocity models and source locations. We show that full waveform modeling with neural operators is nearly two orders of magnitude faster than conventional numerical methods, and more importantly, the trained model enables accurate simulation for velocity models, source locations, and mesh discretization distinctly different from the training dataset. The method also enables convenient full-waveform inversion with automatic differentiation.
Submitted 4 April, 2023; v1 submitted 24 September, 2022;
originally announced September 2022.
-
Generative Adversarial Neural Operators
Authors:
Md Ashiqur Rahman,
Manuel A. Florez,
Anima Anandkumar,
Zachary E. Ross,
Kamyar Azizzadenesheli
Abstract:
We propose the generative adversarial neural operator (GANO), a generative model paradigm for learning probabilities on infinite-dimensional function spaces. The natural sciences and engineering are known to have many types of data that are sampled from infinite-dimensional function spaces, where classical finite-dimensional deep generative adversarial networks (GANs) may not be directly applicable. GANO generalizes the GAN framework and allows for the sampling of functions by learning push-forward operator maps in infinite-dimensional spaces. GANO consists of two main components, a generator neural operator and a discriminator neural functional. The inputs to the generator are samples of functions from a user-specified probability measure, e.g., Gaussian random field (GRF), and the generator outputs are synthetic data functions. The input to the discriminator is either a real or synthetic data function. In this work, we instantiate GANO using the Wasserstein criterion and show how the Wasserstein loss can be computed in infinite-dimensional spaces. We empirically study GANO in controlled cases where both input and output functions are samples from GRFs and compare its performance to the finite-dimensional counterpart GAN. We empirically study the efficacy of GANO on real-world function data of volcanic activities and show its superior performance over GAN.
Submitted 12 October, 2022; v1 submitted 6 May, 2022;
originally announced May 2022.
-
U-NO: U-shaped Neural Operators
Authors:
Md Ashiqur Rahman,
Zachary E. Ross,
Kamyar Azizzadenesheli
Abstract:
Neural operators generalize classical neural networks to maps between infinite-dimensional spaces, e.g., function spaces. Prior works on neural operators proposed a series of novel methods to learn such maps and demonstrated unprecedented success in learning solution operators of partial differential equations. Due to their close proximity to fully connected architectures, these models mainly suffer from high memory usage and are generally limited to shallow deep learning models. In this paper, we propose U-shaped Neural Operator (U-NO), a U-shaped memory enhanced architecture that allows for deeper neural operators. U-NOs exploit the problem structures in function predictions and demonstrate fast training, data efficiency, and robustness with respect to hyperparameter choices. We study the performance of U-NO on PDE benchmarks, namely, Darcy's flow law and the Navier-Stokes equations. We show that U-NO results in an average of 26% and 44% prediction improvement on Darcy's flow and turbulent Navier-Stokes equations, respectively, over the state of the art. On the Navier-Stokes 3D spatiotemporal operator learning task, we show U-NO provides 37% improvement over the state-of-the-art methods.
Submitted 5 May, 2023; v1 submitted 23 April, 2022;
originally announced April 2022.
-
Seismic wave propagation and inversion with Neural Operators
Authors:
Yan Yang,
Angela F. Gao,
Jorge C. Castellanos,
Zachary E. Ross,
Kamyar Azizzadenesheli,
Robert W. Clayton
Abstract:
Seismic wave propagation forms the basis for most aspects of seismological research, yet solving the wave equation is a major computational burden that inhibits the progress of research. This is exacerbated by the fact that new simulations must be performed when the velocity structure or source location is perturbed. Here, we explore a prototype framework for learning general solutions using a recently developed machine learning paradigm called Neural Operator. A trained Neural Operator can compute a solution in negligible time for any velocity structure or source location. We develop a scheme to train Neural Operators on an ensemble of simulations performed with random velocity models and source locations. As Neural Operators are grid-free, it is possible to evaluate solutions on higher resolution velocity models than trained on, providing additional computational efficiency. We illustrate the method with the 2D acoustic wave equation and demonstrate the method's applicability to seismic tomography, using reverse mode automatic differentiation to compute gradients of the wavefield with respect to the velocity structure. The developed procedure is nearly an order of magnitude faster than using conventional numerical methods for full waveform inversion.
Submitted 13 October, 2021; v1 submitted 11 August, 2021;
originally announced August 2021.
-
Deep Learning-based Damage Mapping with InSAR Coherence Time Series
Authors:
Oliver L. Stephenson,
Tobias Köhne,
Eric Zhan,
Brent E. Cahill,
Sang-Ho Yun,
Zachary E. Ross,
Mark Simons
Abstract:
Satellite remote sensing is playing an increasing role in the rapid mapping of damage after natural disasters. In particular, synthetic aperture radar (SAR) can image the Earth's surface and map damage in all weather conditions, day and night. However, current SAR damage mapping methods struggle to separate damage from other changes in the Earth's surface. In this study, we propose a novel approach to damage mapping, combining deep learning with the full time history of SAR observations of an impacted region in order to detect anomalous variations in the Earth's surface properties due to a natural disaster. We quantify Earth surface change using time series of Interferometric SAR coherence, then use a recurrent neural network (RNN) as a probabilistic anomaly detector on these coherence time series. The RNN is first trained on pre-event coherence time series, and then forecasts a probability distribution of the coherence between pre- and post-event SAR images. The difference between the forecast and observed co-event coherence provides a measure of the confidence in the identification of damage. The method allows the user to choose a damage detection threshold that is customized for each location, based on the local behavior of coherence through time before the event. We apply this method to calculate estimates of damage for three earthquakes using multi-year time series of Sentinel-1 SAR acquisitions. Our approach shows good agreement with observed damage and quantitative improvement compared to using pre- to co-event coherence loss as a damage proxy.
Submitted 24 May, 2021;
originally announced May 2021.
-
HypoSVI: Hypocenter inversion with Stein variational inference and Physics Informed Neural Networks
Authors:
Jonathan D. Smith,
Zachary E. Ross,
Kamyar Azizzadenesheli,
Jack B. Muir
Abstract:
We introduce a scheme for probabilistic hypocenter inversion with Stein variational inference. Our approach uses a differentiable forward model in the form of a physics informed neural network, which we train to solve the Eikonal equation. This allows for rapid approximation of the posterior by iteratively optimizing a collection of particles against a kernelized Stein discrepancy. We show that the method is well-equipped to handle highly multimodal posterior distributions, which are common in hypocentral inverse problems. A suite of experiments is performed to examine the influence of the various hyperparameters. Once trained, the method is valid for any seismic network geometry within the study area without the need to build travel time tables. We show that the computational demands scale efficiently with the number of differential times, making it ideal for large-N sensing technologies like Distributed Acoustic Sensing. The techniques outlined in this manuscript have considerable implications beyond just ray-tracing procedures, with the work flow applicable to other fields with computationally expensive inversion procedures such as full waveform inversion.
Submitted 17 August, 2022; v1 submitted 8 January, 2021;
originally announced January 2021.
-
Data-driven Accelerogram Synthesis using Deep Generative Models
Authors:
Manuel A. Florez,
Michaelangelo Caporale,
Pakpoom Buabthong,
Zachary E. Ross,
Domniki Asimaki,
Men-Andrin Meier
Abstract:
Robust estimation of ground motions generated by scenario earthquakes is critical for many engineering applications. We leverage recent advances in Generative Adversarial Networks (GANs) to develop a new framework for synthesizing earthquake acceleration time histories. Our approach extends the Wasserstein GAN formulation to allow for the generation of ground motions conditioned on a set of continuous physical variables. Our model is trained to approximate the intrinsic probability distribution of a massive set of strong-motion recordings from Japan. We show that the trained generator model can synthesize realistic three-component accelerograms conditioned on magnitude, distance, and $V_{s30}$. Our model captures the expected statistical features of the acceleration spectra and waveform envelopes. The output seismograms display clear P- and S-wave arrivals with the appropriate energy content and relative onset timing. The synthesized Peak Ground Acceleration (PGA) estimates are also consistent with observations. We develop a set of metrics that allow us to assess the training process's stability and tune model hyperparameters. We further show that the trained generator network can interpolate to conditions where no earthquake ground motion recordings exist. Our approach allows the on-demand synthesis of accelerograms for engineering purposes.
Submitted 17 November, 2020;
originally announced November 2020.
-
EikoNet: Solving the Eikonal equation with Deep Neural Networks
Authors:
Jonathan D. Smith,
Kamyar Azizzadenesheli,
Zachary E. Ross
Abstract:
The recent deep learning revolution has created an enormous opportunity for accelerating compute capabilities in the context of physics-based simulations. Here, we propose EikoNet, a deep learning approach to solving the Eikonal equation, which characterizes the first-arrival-time field in heterogeneous 3D velocity structures. Our grid-free approach allows for rapid determination of the travel time between any two points within a continuous 3D domain. These travel time solutions are allowed to violate the differential equation, which casts the problem as one of optimization, with the goal of finding network parameters that minimize the degree to which the equation is violated. In doing so, the method exploits the differentiability of neural networks to calculate the spatial gradients analytically, meaning the network can be trained on its own without ever needing solutions from a finite difference algorithm. EikoNet is rigorously tested on several velocity models and sampling methods to demonstrate robustness and versatility. Training and inference are highly parallelized, making the approach well-suited for GPUs. EikoNet has low memory overhead, and further avoids the need for travel-time lookup tables. The developed approach has important applications to earthquake hypocenter inversion, ray multi-pathing, and tomographic modeling, as well as to other fields beyond seismology where ray tracing is essential.
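For reference, the Eikonal equation named in this abstract relates the travel time $T$ from a source $x_s$ to a receiver $x_r$ to the local velocity $V$. A residual-based training objective of the kind the abstract describes could be written as below. The symbols $T_\theta$, $N$, and the squared-residual form of $\mathcal{L}$ are assumptions made for illustration; the paper's exact loss and parameterization may differ.

```latex
% Eikonal equation for the first-arrival travel time T
\left\| \nabla_{x_r} T(x_s, x_r) \right\|^2 = \frac{1}{V(x_r)^2}

% Illustrative residual loss for a network T_\theta over N sampled source-receiver pairs
\mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N}
  \left( \left\| \nabla_{x_r} T_\theta\big(x_s^{(i)}, x_r^{(i)}\big) \right\|^2
         - \frac{1}{V\big(x_r^{(i)}\big)^2} \right)^2
```

Because the gradient $\nabla_{x_r} T_\theta$ can be obtained by automatic differentiation, minimizing such a residual requires only sampled points and the velocity model, not precomputed finite-difference solutions.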
Submitted 11 August, 2020; v1 submitted 24 March, 2020;
originally announced April 2020.
-
Extracting dispersion curves from ambient noise correlations using deep learning
Authors:
Xiaotian Zhang,
Zhe Jia,
Zachary E. Ross,
Robert W. Clayton
Abstract:
We present a machine-learning approach to classifying the phases of surface wave dispersion curves. Standard FTAN analysis of surface waves observed on an array of receivers is converted to an image, in which each pixel is classified as fundamental mode, first overtone, or noise. We use a convolutional neural network (U-net) architecture with a supervised learning objective and incorporate transfer learning. The training is initially performed with synthetic data to learn coarse structure, followed by fine-tuning of the network using approximately 10% of the real data based on human classification. The results show that the machine classification is nearly identical to the human-picked phases. Expanding the method to process multiple images at once did not improve the performance. The developed technique will facilitate automated processing of large dispersion curve datasets.
Submitted 5 February, 2020;
originally announced February 2020.
-
Directivity Modes of Earthquake Populations with Unsupervised Learning
Authors:
Zachary E. Ross,
Daniel T. Trugman,
Kamyar Azizzadenesheli,
Anima Anandkumar
Abstract:
We present a novel approach for resolving modes of rupture directivity in large populations of earthquakes. A seismic spectral decomposition technique is used to first produce relative measurements of radiated energy for earthquakes in a spatially-compact cluster. The azimuthal distribution of energy for each earthquake is then assumed to result from one of several distinct modes of rupture propagation. Rather than fitting a kinematic rupture model to determine the most likely mode of rupture propagation, we instead treat the modes as latent variables and learn them with a Gaussian mixture model. The mixture model simultaneously determines the number of events that best identify with each mode. The technique is demonstrated on four datasets in California with several thousand earthquakes. We show that the datasets naturally decompose into distinct rupture propagation modes that correspond to different rupture directions, and the fault plane is unambiguously identified for all cases. We find that these small earthquakes exhibit unilateral ruptures 53-74% of the time on average. The results provide important observational constraints on the physics of earthquakes and faults.
Submitted 30 June, 2019;
originally announced July 2019.
-
Reliable Real-time Seismic Signal/Noise Discrimination with Machine Learning
Authors:
Men-Andrin Meier,
Zachary E. Ross,
Anshul Ramachandran,
Ashwin Balakrishna,
Suraj Nair,
Peter Kundzicz,
Zefeng Li,
Jennifer Andrews,
Egill Hauksson,
Yisong Yue
Abstract:
In Earthquake Early Warning (EEW), every sufficiently impulsive signal is potentially the first evidence for an unfolding large earthquake. More often than not, however, impulsive signals are mere nuisance signals. One of the most fundamental, and most difficult, tasks in EEW is to rapidly and reliably discriminate real local earthquake signals from all other signals. This discrimination is necessarily based on very little information, typically a few seconds' worth of seismic waveforms from a small number of stations. As a result, current EEW systems struggle to avoid discrimination errors, and suffer from false and missed alerts. In this study we show how modern machine learning classifiers can strongly improve real-time signal/noise discrimination. We develop and compare a series of non-linear classifiers with variable architecture depths, including fully connected, convolutional (CNN) and recurrent neural networks, and a model that combines a generative adversarial network with a random forest (GAN+RF). We train all classifiers on the same data set, which includes 374k local earthquake records (M3.0-9.1) and 946k impulsive noise signals. We find that all classifiers outperform existing simple linear classifiers, and that complex models trained directly on the raw signals yield the greatest degree of improvement. Using 3 s long waveform snippets, the CNN and the GAN+RF classifiers both reach 99.5% precision and 99.3% recall on an independent validation data set. Most misclassifications stem from impulsive teleseismic records, and from incorrectly labeled records in the data set. Our results suggest that machine learning classifiers can strongly improve the reliability and speed of EEW alerts.
Submitted 10 January, 2019;
originally announced January 2019.
-
PhaseLink: A Deep Learning Approach to Seismic Phase Association
Authors:
Zachary E. Ross,
Yisong Yue,
Men-Andrin Meier,
Egill Hauksson,
Thomas H. Heaton
Abstract:
Seismic phase association is a fundamental task in seismology that pertains to linking together phase detections on different sensors that originate from a common earthquake. It is widely employed to detect earthquakes on permanent and temporary seismic networks, and underlies most seismicity catalogs produced around the world. This task can be challenging because the number of sources is unknown, events frequently overlap in time, or can occur simultaneously in different parts of a network. We present PhaseLink, a framework based on recent advances in deep learning for grid-free earthquake phase association. Our approach learns to link phases together that share a common origin, and is trained entirely on tens of millions of synthetic sequences of P- and S-wave arrival times generated using a simple 1D velocity model. Our approach is simple to implement for any tectonic regime, suitable for real-time processing, and can naturally incorporate errors in arrival time picks. Rather than tuning a set of ad hoc hyperparameters to improve performance, PhaseLink can be improved by simply adding examples of problematic cases to the training dataset. We demonstrate the state-of-the-art performance of PhaseLink on a challenging recent sequence from southern California, and synthesized sequences from Japan designed to test the point at which the method fails. For the examined datasets, PhaseLink can precisely associate P- and S-picks to events that are separated by ~12 seconds in origin time. This approach is expected to improve the resolution of seismicity catalogs, add stability to real-time seismic monitoring, and streamline automated processing of large seismic datasets.
Submitted 10 January, 2019; v1 submitted 8 September, 2018;
originally announced September 2018.
-
Generalized Seismic Phase Detection with Deep Learning
Authors:
Zachary E. Ross,
Men-Andrin Meier,
Egill Hauksson,
Thomas H. Heaton
Abstract:
To optimally monitor earthquake-generating processes, seismologists have sought to lower detection sensitivities ever since instrumental seismic networks were started about a century ago. Recently, it has become possible to search continuous waveform archives for replicas of previously recorded events (template matching), which has led to at least an order of magnitude increase in the number of detected earthquakes and greatly sharpened our view of geological structures. Earthquake catalogs produced in this fashion, however, are heavily biased in that they are completely blind to events for which no templates are available, such as in previously quiet regions or for very large magnitude events. Here we show that with deep learning we can overcome such biases without sacrificing detection sensitivity. We trained a convolutional neural network (ConvNet) on the vast hand-labeled data archives of the Southern California Seismic Network to detect seismic body wave phases. We show that the ConvNet is extremely sensitive and robust in detecting phases, even when masked by high background noise, and when the ConvNet is applied to new data that is not represented in the training set (in particular, very large magnitude events). This generalized phase detection (GPD) framework will significantly improve earthquake monitoring and catalogs, which form the underlying basis for a wide range of basic and applied seismological research.
Submitted 10 January, 2019; v1 submitted 2 May, 2018;
originally announced May 2018.
-
P-wave arrival picking and first-motion polarity determination with deep learning
Authors:
Zachary E. Ross,
Men-Andrin Meier,
Egill Hauksson
Abstract:
Determining earthquake hypocenters and focal mechanisms requires precisely measured P-wave arrival times and first-motion polarities. Automated algorithms for estimating these quantities have been less accurate than estimates by human experts, which is problematic for processing large data volumes. Here, we train convolutional neural networks, which learn directly from seismograms without the need for feature extraction, to measure both quantities. The networks are trained on 18.2 million manually picked seismograms for the southern California region. Through cross-validation on 1.2 million independent seismograms, the differences between the automated and manual picks have a standard deviation of 0.023 seconds. The polarities determined by the classifier have a precision of 95% when compared with analyst-determined polarities. We show that the classifier picks more polarities overall than the analysts, without sacrificing quality, resulting in almost double the number of focal mechanisms. The remarkable precision of the trained networks indicates that they can perform as well as, or better than, expert seismologists.
Submitted 23 April, 2018;
originally announced April 2018.